Title: | Scientific Data on Time of Lineage Divergence for Your Taxa |
---|---|
Description: | Methods and workflows to get chronograms (i.e., phylogenetic trees with branch lengths proportional to time), using open, peer-reviewed, state-of-the-art scientific data on time of lineage divergence. This package constitutes the main underlying code of the DateLife web service at <https://www.datelife.org>. To obtain a single summary chronogram from a group of relevant chronograms, we implement the Super Distance Matrix (SDM) method described in Criscuolo et al. (2006) <doi:10.1080/10635150600969872>. To find the grove of chronograms with a sufficiently overlapping set of taxa for summarizing, we implement theorem 1.1. from Ané et al. (2009) <doi:10.1007/s00026-009-0017-x>. A given phylogenetic tree can be dated using time of lineage divergence data as secondary calibrations (with caution, see Schenk (2016) <doi:10.1371/journal.pone.0148228>). To obtain and apply secondary calibrations, the package implements the congruification method described in Eastman et al. (2013) <doi:10.1111/2041-210X.12051>. Tree dating can be performed with different methods including BLADJ (Webb et al. (2008) <doi:10.1093/bioinformatics/btn358>), PATHd8 (Britton et al. (2007) <doi:10.1080/10635150701613783>), mrBayes (Huelsenbeck and Ronquist (2001) <doi:10.1093/bioinformatics/17.8.754>), and treePL (Smith and O'Meara (2012) <doi:10.1093/bioinformatics/bts492>). |
Authors: | Brian O'Meara [aut], Jonathan Eastman [aut], Tracy Heath [aut], April Wright [aut], Klaus Schliep [aut], Scott Chamberlain [aut], Peter Midford [aut], Luke Harmon [aut], Joseph Brown [aut], Matt Pennell [aut], Mike Alfaro [aut], Luna L. Sanchez Reyes [aut, cre], Emily Jane McTavish [ctb] |
Maintainer: | Luna L. Sanchez Reyes <[email protected]> |
License: | GPL (>=2) |
Version: | 0.6.9 |
Built: | 2024-09-12 19:22:43 UTC |
Source: | https://github.com/phylotastic/datelife |
Get the lineage of a set of taxa.
.get_ott_lineage uses rotl::taxonomy_taxon_info() with include_lineage = TRUE.
.get_ott_lineage(input_ott_match)
input_ott_match |
An output of the check_ott_input() function. |
A taxonomy_taxon_info object
A multiPhylo object with trees resulting from a datelife search of some bird and cat species
birds_and_cats
A multiPhylo object
Generated with:
taxa <- c("Rhea americana", "Pterocnemia pennata", "Struthio camelus", "Gallus", "Felis")
birds_and_cats <- datelife_search(input = taxa, summary_format = "phylo_all", get_spp_from_taxon = TRUE)
usethis::use_data(birds_and_cats)
This function implements theorem 1.1 of Ané et al. (2009) doi:10.1007/s00026-009-0017-x to find a grove for a given group of chronograms.
build_grove_list(datelife_result, n = 2)
datelife_result |
A |
n |
The degree of taxon name overlap among input chronograms. Defaults to 2. |
A list of vectors; each list element is a grove.
This function implements theorem 1.1 of Ané et al. (2009) doi:10.1007/s00026-009-0017-x to find a grove for a given group of chronograms.
build_grove_matrix(datelife_result, n = 2)
datelife_result |
A |
n |
The degree of taxon name overlap among input chronograms. Defaults to 2. |
A matrix. Each cell shows whether n-overlap exists between a pair of inputs.
Ané, C., Eulenstein, O., Piaggio-Talice, R., & Sanderson, M. J. (2009). "Groves of phylogenetic trees". Annals of Combinatorics, 13(2), 139-167, doi:10.1007/s00026-009-0017-x.
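The pairwise n-overlap test that the grove matrix reports can be illustrated without any package. The sketch below is not the datelife implementation: it assumes each chronogram is reduced to a character vector of its tip labels, and the names taxa_sets and overlap_matrix are hypothetical. A pair of inputs is marked TRUE when it shares at least n taxon names.

```r
# Hypothetical tip-label sets standing in for three chronograms.
taxa_sets <- list(
  tree1 = c("Rhea americana", "Struthio camelus", "Gallus gallus"),
  tree2 = c("Struthio camelus", "Gallus gallus", "Felis catus"),
  tree3 = c("Felis catus", "Panthera leo")
)

# TRUE where two inputs share at least n taxon names.
overlap_matrix <- function(sets, n = 2) {
  shared <- sapply(sets, function(a) {
    sapply(sets, function(b) length(intersect(a, b)))
  })
  shared >= n
}

grove_matrix <- overlap_matrix(taxa_sets, n = 2)
print(grove_matrix)
```

Here only tree1 and tree2 share two names, so they are the only distinct pair flagged as overlapping.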
check_conflicting_calibrations checks if calibrations are younger or older relative to descendants and ancestors, respectively.
check_conflicting_calibrations(phy, calibration_distribution)
phy |
A |
calibration_distribution |
A list of node age distributions, named with |
It removes conflicting calibrations if needed, but BLADJ works as long as it has an age for the root.
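As an illustration of this kind of check, the sketch below flags calibrations that are not younger than the calibration of their parent node. It is not the package's code: the parent vector, the node_age values and the find_conflicts helper are all made up for the example, and a real phylo object would supply the ancestor relationships through its edge matrix.

```r
# Hypothetical calibrated nodes; parent[i] names the parent of node i
# (NA marks the root).
parent <- c(n1 = NA, n2 = "n1", n3 = "n2", n4 = "n1")
# Hypothetical node ages in million years; n3 is older than its parent n2.
node_age <- c(n1 = 100, n2 = 40, n3 = 55, n4 = 70)

# A node conflicts when its age is greater than or equal to its parent's age.
find_conflicts <- function(parent, node_age) {
  parent_age <- node_age[parent]
  conflict <- !is.na(parent_age) & node_age >= parent_age
  names(node_age)[conflict]
}

conflicting <- find_conflicts(parent, node_age)
print(conflicting)
```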
check_ott_input is currently used in functions get_ott_clade(), get_ott_children(), and get_otol_synthetic_tree().
check_ott_input(input = NULL, ott_ids = NULL, ...)
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
... |
Arguments passed on to
|
By default, it uses the ott_ids argument if it is not NULL.
A named numeric vector of valid Open Tree Taxonomy (OTT) ids.
Choose an ultrametric phylo object from cluster_patristicmatrix() obtained with a particular clustering method, or the next best tree. If there are no ultrametric trees, it does not force them to be ultrametric.
choose_cluster(phycluster, clustering_method = "nj")
phycluster |
An output from cluster_patristicmatrix(). |
clustering_method |
A character vector indicating the method to construct the tree. Options are:
|
A phylo object or NA.
This uses the taxize package's wrapper of the Global Names Resolver to get taxonomic paths for the vector of taxa you pass in. The sources argument is a vector of source labels in order of preference (though it works best if everything uses the same taxonomy, so we recommend using just one source). You can see the options by running taxize::gnr_datasources(). Our default is Catalogue of Life.
classification_paths_from_taxonomy(taxa, sources = "Catalogue of Life")
taxa |
Vector of taxon names |
sources |
Vector of names of preferred sources; see taxize::gnr_datasources(). Currently supports 100 taxonomic resources, see details. |
Taxonomies supported by taxize::gnr_datasources()
Catalogue of Life
Wikispecies
ITIS
NCBI
Index Fungorum
GRIN Taxonomy for Plants
Union 4
The Interim Register of Marine and Nonmarine Genera
World Register of Marine Species
Freebase
GBIF Backbone Taxonomy
EOL
Passiflora vernacular names
Inventory of Fish Species in the Wami River Basin
Pheasant Diversity and Conservation in the Mt. Gaoligonshan Region
Finding Species
Birds of Lindi Forests Plantation
Nemertea
Kihansi Gorge Amphibian Species Checklist
Mushroom Observer
TaxonConcept
Amphibia and Reptilia of Yunnan
Common names of Chilean Plants
Invasive Species of Belgium
ZooKeys
COA Wildlife Conservation List
AskNature
China: Yunnan, Southern Gaoligongshan, Rapid Biological Inventories Report No. 04
Native Orchids from Gaoligongshan Mountains, China
Illinois Wildflowers
Coleorrhyncha Species File
/home/dimus/files/dwca/zoological names.zip
Peces de la zona hidrogeográfica de la Amazonia, Colombia (Spreadsheet)
Eastern Mediterranean Syllidae
Gaoligong Shan Medicinal Plants Checklist
birds_of_tanzania
AmphibiaWeb
tanzania_plant_sepecimens
Papahanaumokuakea Marine National Monument
Taiwanese IUCN species list
BioPedia
AnAge
Embioptera Species File
Global Invasive Species Database
Sendoya S., Fernández F. AAT de hormigas (Hymenoptera: Formicidae) del Neotrópico 1.0 2004 (Spreadsheet)
Flora of Gaoligong Mountains
ARKive
True Fruit Flies (Diptera, Tephritidae) of the Afrotropical Region
3i - Typhlocybinae Database
CATE Sphingidae
ZooBank
Diatoms
AntWeb
Endemic species in Taiwan
Dermaptera Species File
Mantodea Species File
Birds of the World: Recommended English Names
New Zealand Animalia
Blattodea Species File
Plecoptera Species File
/home/dimus/files/dwca/clemens.zip
Coreoidea Species File
Freshwater Animal Diversity Assessment - Normalized export
Catalogue of Vascular Plant Species of Central and Northeastern Brazil
Wikipedia in EOL
Database of Vascular Plants of Canada (VASCAN)
Phasmida Species File
OBIS
USDA NRCS PLANTS Database
Catalog of Fishes
Aphid Species File
The National Checklist of Taiwan
Psocodea Species File
FishBase
3i - Typhlocybinae Database
Belgian Species List
EUNIS
CU*STAR
Orthoptera Species File
Bishop Museum
IUCN Red List of Threatened Species
BioLib.cz
Tropicos - Missouri Botanical Garden
nlbif
The International Plant Names Index
Index to Organism Names
uBio NameBank
Arctos
Checklist of Beetles (Coleoptera) of Canada and Alaska. Second Edition.
The Paleobiology Database
The Reptile Database
The Mammal Species of The World
BirdLife International
Checklist da Flora de Portugal (Continental, Açores e Madeira)
FishBase Cache
Silva
Open Tree of Life Reference Taxonomy
iNaturalist
The Interim Register of Marine and Nonmarine Genera
Gymno
A list with resolved taxa (a tibble, from taxize::gnr_resolve) and a vector of taxa that were not resolved.
Clean up some issues with Open Tree of Life chronograms. For now it 1) checks unmapped taxa and maps them with tnrs_match.phylo, and 2) roots the chronogram if unrooted.
clean_ott_chronogram(phy)
phy |
A |
There is no limit to the number of names that can be queried and matched.
The output will preserve all elements from original input phylo object and will add
A character vector indicating the state of mapping of phy$tip.labels:
Tnrs matching was not attempted. Original labeling is preserved.
Matching was manually made by a curator in Open Tree of Life.
Tnrs matching was attempted and successful with no approximate matching. Original label is replaced by the matched name.
Tnrs matching was attempted and successful but with approximate matching. Original labeling is preserved.
Tnrs matching was attempted and unsuccessful. Original labeling is preserved.
A character vector preserving all original labels.
A numeric vector with ott id numbers of matched tips. Unmatched and original tips will be NaN.
If tips are duplicated, tnrs will only be run once (avoiding increases in function running time), but the result will be applied to all duplicated tip labels.
An object of class data frame or phylo, with the added class match_names.
Clean a taxonomy_taxon_info() output. clean_taxon_info_children eliminates all taxa that will give problems when trying to retrieve an induced subtree from Open Tree of Life.
clean_taxon_info_children( taxon_info, invalid = c("barren", "extinct", "uncultured", "major_rank_conflict", "incertae_sedis", "unplaced", "conflict", "environmental", "not_otu", "hidden", "hybrid") )
taxon_info |
An output of |
invalid |
A character vector of "flags", i.e., characteristics that are used by Open Tree of Life Taxonomy to detect invalid taxon names. |
A list with valid children unique OTT names, OTT ids and taxonomic ranks.
Eliminates unmatched (NAs) and invalid taxa from a rotl::tnrs_match_names() or tnrs_match() output. Useful to get ott ids to retrieve an induced synthetic Open Tree of Life. Needed because using include_suppressed = FALSE in rotl::tnrs_match_names() does not drop all invalid taxa.
clean_tnrs( tnrs, invalid = c("barren", "extinct", "uncultured", "major_rank_conflict", "incertae", "unplaced", "conflict", "environmental", "not_otu"), remove_nonmatches = FALSE )
tnrs |
A data frame, usually an output from datelife::tnrs_match or rotl::tnrs_match_names functions, but see details. |
invalid |
A character vector with flags to be removed from the final object. |
remove_nonmatches |
Boolean, whether to remove unsuccessfully matched names or not. |
Input can be any data frame or named list that relates taxa stored in an element named "unique" to a validity category stored in "flags".
A data frame or named list (depending on the input) with valid taxa only.
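The input contract described above (taxon names in "unique", validity categories in "flags") can be sketched in base R. This is an illustrative filter, not the code of clean_tnrs(): the table contents, the shortened invalid vector and the drop_invalid helper are hypothetical.

```r
# Hypothetical TNRS-like table with the two elements the text names:
# "unique" (taxon names) and "flags" (validity categories).
tnrs <- data.frame(
  unique = c("Felis catus", "Smilodon", "Canis lupus", "Bacteria sp."),
  flags = c("", "extinct", "", "environmental, not_otu"),
  stringsAsFactors = FALSE
)

# Keep rows whose flags contain none of the invalid categories.
drop_invalid <- function(x, invalid = c("barren", "extinct", "not_otu")) {
  pattern <- paste(invalid, collapse = "|")
  x[!grepl(pattern, x$flags, ignore.case = TRUE), , drop = FALSE]
}

valid <- drop_invalid(tnrs)
print(valid$unique)
```

Matching on a pattern rather than on exact equality lets a row carrying several comma-separated flags be dropped when any one of them is invalid.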
Cluster a patristic matrix into a tree with various methods.
cluster_patristicmatrix(patristic_matrix, variance_matrix = NULL)
patristic_matrix |
A patristic matrix |
variance_matrix |
A variance matrix from a |
If a clustering method fails, NA is returned.
A list of trees obtained with clustering methods detailed in patristic_matrix_to_phylo().
Congruify and Check.
congruify_and_check( reference, target, taxonomy = NULL, tol = 0.01, option = 2, scale = "pathd8", attempt_fix = TRUE )
reference |
an ultrametric tree used to time-scale the target |
target |
a phylogram that is sought to be ultrametricized based on the reference |
taxonomy |
a linkage table between tips of the phylogeny and clades represented in the tree; rownames of 'taxonomy' should be tips found in the phylogeny |
tol |
branching time in |
option |
an integer (1 or 2; see details). |
scale |
|
attempt_fix |
Default to |
congruify_and_mrca_multiPhylo congruifies a target tree against all source chronograms in a multiPhylo object, and gets nodes of target tree that correspond to the most recent common ancestor (mrca) of taxon pairs in the congruified calibrations. It calls congruify_and_mrca_phylo(), and phytools::findMRCA() to get mrca nodes.
congruify_and_mrca_multiPhylo(phy, source_chronograms)
phy |
A |
source_chronograms |
A |
A data.frame of node ages from source_chronograms and corresponding mrca nodes in target tree phy. attributes(return)$phy stores the congruified and mrca matched phylogeny.
congruify_and_mrca_phylo congruifies a target tree against a single source chronogram, and gets nodes of target tree that correspond to the most recent common ancestor (mrca) of taxon pairs from the congruified calibrations. It uses phytools::findMRCA() to get mrca nodes.
congruify_and_mrca_phylo(phy, source_chronogram, reference)
phy |
A |
source_chronogram |
A |
reference |
A character string indicating the study reference that the |
A data.frame of node ages from source_chronogram and corresponding mrca nodes in target tree phy.
Information on contributors, authors, study ids and clades from studies with chronograms in Open Tree of Life (Open Tree)
contributor_cache
A list of five data sets.
A character vector with the author names from studies with chronograms that are in Open Tree.
A data.frame with three variables: authors, study ids and clades.
A character vector with the names of curators of chronograms that are in Open Tree.
A data.frame with three variables: curators, study ids and clades.
A character vector with study ids whose "doi" could not be retrieved.
Generated with make_contributor_cache().
This will take a topology, look up information about fossils for taxa on the tree, and use paleotree::timePaleoPhy() to compute branch lengths.
date_with_pbdb(phy, recent = FALSE, assume_recent_if_missing = TRUE)
phy |
A |
recent |
If |
assume_recent_if_missing |
If |
A dated tree.
## Not run: # This is a flag for package development. You are welcome to run the example.
taxa <- c(
  "Archaeopteryx", "Pinus", "Quetzalcoatlus", "Homo sapiens",
  "Tyrannosaurus rex", "Megatheriidae", "Metasequoia", "Aedes", "Panthera"
)
phy <- tree_from_taxonomy(taxa, sources = "The Paleobiology Database")$phy
## End(Not run) # end dontrun
Return the relevant authors for a set of studies.
datelife_authors_tabulate(results.index, cache = "opentree_chronograms")
results.index |
A vector from |
cache |
The cached chronogram database. |
A vector with counts of each author, with names equal to author names.
Get a median summary chronogram from a datelifeResult object.
datelife_result_median(datelife_result, ...)
datelife_result |
A |
... |
Arguments passed on to
|
A phylo object.
Compute a median matrix of a datelifeResult object.
datelife_result_median_matrix(datelife_result)
datelife_result |
A |
A patristic distance summary matrix from a datelifeResult object.
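A cell-wise median over several patristic matrices can be sketched in base R. This is only an illustration under the assumption that the input behaves like a list of square, named patristic matrices; the matrices m1 and m2 and the median_matrix helper are hypothetical, and the package's own function may handle missing taxa differently.

```r
# Two hypothetical patristic matrices with partially overlapping taxa.
m1 <- matrix(c(0, 10, 10, 0), 2, 2,
  dimnames = list(c("A", "B"), c("A", "B")))
m2 <- matrix(c(0, 14, 30, 14, 0, 30, 30, 30, 0), 3, 3,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Pad every matrix to the union of taxa, then take cell-wise medians.
median_matrix <- function(matrices) {
  taxa <- sort(unique(unlist(lapply(matrices, rownames))))
  padded <- vapply(matrices, function(m) {
    out <- matrix(NA_real_, length(taxa), length(taxa),
      dimnames = list(taxa, taxa))
    out[rownames(m), colnames(m)] <- m
    out
  }, matrix(NA_real_, length(taxa), length(taxa)))
  result <- apply(padded, c(1, 2), stats::median, na.rm = TRUE)
  dimnames(result) <- list(taxa, taxa)
  result
}

med <- median_matrix(list(m1, m2))
print(med)
```

Pairs present in only one matrix (here every pair involving C) simply keep that single value, because missing cells are ignored when the median is taken.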
Get a numeric vector of MRCAs from a datelifeResult object. Used in summarize_datelife_result().
datelife_result_MRCA(datelife_result, na_rm = TRUE)
datelife_result |
A |
na_rm |
If |
A named numeric vector of MRCA ages for each element given in datelife_result.
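The relationship between a patristic matrix and an MRCA age can be sketched as follows. This is an illustration rather than the package's code, and it assumes that each matrix holds tip-to-tip path lengths from an ultrametric chronogram; the patristic list and its values are hypothetical.

```r
# Hypothetical patristic matrices (tip-to-tip path lengths, in million years).
patristic <- list(
  study1 = matrix(c(0, 20, 20, 0), 2, 2),
  study2 = matrix(c(0, 16, 50, 16, 0, 50, 50, 50, 0), 3, 3)
)

# On an ultrametric tree the two most distant tips join at the root, so the
# age of the MRCA of all taxa is half the largest patristic distance.
mrca_ages <- vapply(
  patristic,
  function(m) max(m, na.rm = TRUE) / 2,
  numeric(1)
)
print(mrca_ages)
```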
Go from a datelifeResult object to a Super Distance Matrix (SDM) using weighting = "flat".
datelife_result_sdm_matrix(datelife_result)
datelife_result |
A |
A numeric matrix.
Reconstruct a supertree from a datelifeResult object using the Super Distance Matrix (SDM) method.
datelife_result_sdm_phylo(datelife_result, weighting = "flat", ...)
datelife_result |
A |
weighting |
A character vector indicating how much weight to give to each
tree in
Defaults to |
... |
Arguments passed on to
|
Chronograms given as input in datelife_result are summarized with the Super Distance Matrix (SDM) method described in Criscuolo et al. (2006) doi:10.1080/10635150600969872, implemented with the function ape::SDM(). The resulting summary SDM is clustered with summary_matrix_to_phylo().
A supertree with branch lengths proportional to time, obtained by summarizing individual chronograms given as input in datelife_result. It is returned as an object of class datelifeSDM, which is a phylo object with an additional $data element storing the input chronograms as a datelifeResult object, and a $citation element containing citations of studies from input chronograms.
Criscuolo A, Berry V, Douzery EJ, Gascuel O. (2006) "SDM: a fast distance-based approach for (super) tree building in phylogenomics" doi:10.1080/10635150600969872.
datelife_result_study_index is used in summarize_datelife_result().
datelife_result_study_index(datelife_result, cache = "opentree_chronograms")
datelife_result |
A |
cache |
The cached chronogram database. |
A vector of indices of studies that have relevant information.
Compute a variance matrix of a datelifeResult object.
datelife_result_variance_matrix(datelife_result)
datelife_result |
A |
A variance matrix from a datelifeResult object.
datelife_search is the core DateLife function to find and get all openly available, peer-reviewed scientific information on time of lineage divergence for a set of input taxon names given as a character vector, a newick character string, a phylo or multiPhylo object, or as an already processed datelifeQuery object obtained with make_datelife_query().
datelife_search( input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), use_tnrs = FALSE, get_spp_from_taxon = FALSE, partial = TRUE, cache = "opentree_chronograms", summary_format = "phylo_all", na_rm = FALSE, summary_print = c("citations", "taxa"), taxon_summary = c("none", "summary", "matrix"), criterion = "taxa" )
input |
One of the following:
|
use_tnrs |
Whether to use Open Tree of Life's Taxonomic Name Resolution Service (TNRS)
to process input taxon names. Default to |
get_spp_from_taxon |
Whether to search ages for all species belonging to a
given taxon or not. Default to |
partial |
Whether to return or exclude partially matching source chronograms,
i.e, those that match some and not all of taxa given in |
cache |
A character vector of length one, with the name of the data object
to cache. Default to |
summary_format |
A character vector of length one, indicating the output format for results of the DateLife search. Available output formats are:
|
na_rm |
If |
summary_print |
A character vector specifying the type of summary information to be printed to screen. Options are:
Defaults to |
taxon_summary |
A character vector specifying if data on target taxa missing
in source chronograms should be added to the output as a |
criterion |
Defaults to |
If only one taxon name is given as input, get_spp_from_taxon is always set to TRUE.
The output is determined by the argument summary_format:
summary_format = "citations": The function returns a character vector of references.
summary_format = "mrca": The function returns a named numeric vector of most recent common ancestor (mrca) ages.
summary_format = "newick_[all, sdm, or median]": The function returns output chronograms as newick strings.
summary_format = "phylo_[all, sdm, median, or biggest]": The function returns output chronograms as phylo or multiPhylo objects.
summary_format = "html" or "data_frame": The function returns a 4 column table with data on mrca ages, number of taxa, references, and output chronograms as newick strings.
## Not run:
# For this example, we will set a temp working directory, but you can set
# your working directory as needed:
# we will use the tempdir() function to get a temporary directory:
tempwd <- tempdir()
# Obtain median ages from a set of source chronograms in newick format:
ages <- datelife_search(c(
  "Rhea americana", "Pterocnemia pennata", "Struthio camelus",
  "Mus musculus"
), summary_format = "newick_median")
# Save the tree in the temp working directory in newick format:
write(ages, file = file.path(tempwd, "some.bird.ages.txt"))
# Obtain median ages from a set of source chronograms in phylo format
# Will produce same tree as above but in "phylo" format:
ages.again <- datelife_search(c(
  "Rhea americana", "Pterocnemia pennata", "Struthio camelus",
  "Mus musculus"
), summary_format = "phylo_median")
plot(ages.again)
library(ape)
ape::axisPhylo()
mtext("Time (million years ago)",
  side = 1, line = 2,
  at = (max(get("last_plot.phylo", envir = .PlotPhyloEnv)$xx) * 0.5)
)
# Save "phylo" object in newick format
write.tree(ages.again, file = file.path(tempwd, "some.bird.tree.again.txt"))
# Obtain MRCA ages and target chronograms from all source chronograms
# Generate an html output readable in any web browser:
ages.html <- datelife_search(c(
  "Rhea americana", "Pterocnemia pennata", "Struthio camelus",
  "Mus musculus"
), summary_format = "html")
write(ages.html, file = file.path(tempwd, "some.bird.trees.html"))
system(paste("open", file.path(tempwd, "some.bird.trees.html")))
## End(Not run) # end dontrun
datelife_use gets secondary calibrations available for any pair of given taxon names, mined from the opentree_chronograms object, congruifies them, and uses them to date a given tree topology with the algorithm defined in dating_method. If no tree topology is provided, it will attempt to get one for the given taxon names from the Open Tree of Life synthetic tree, using make_bold_otol_tree().
datelife_use(input = NULL, each = FALSE, dating_method = "bladj", ...)
input |
One of the following:
|
each |
Boolean, default to |
dating_method |
Tree dating algorithm to use. Options are "bladj" or "pathd8" (Webb et al., 2008, doi:10.1093/bioinformatics/btn358; Britton et al., 2007, doi:10.1080/10635150701613783). |
... |
Arguments passed on to
|
If input is a vector of taxon names, the function will attempt to reconstruct a BOLD tree with make_bold_otol_tree() to get a tree with branch lengths. If it fails, it will get an Open Tree of Life synthetic tree topology. The function then calls use_calibrations().
A phylo or multiPhylo object with branch lengths proportional to time. The output object stores the used calibrations and dating_method as attributes(output)$datelife_calibrations and attributes(output)$dating_method.
datelife_use gets secondary calibrations available for any pair of given taxon names, mined from the opentree_chronograms object, congruifies them, and uses them to date a given tree topology with the algorithm defined in dating_method. If no tree topology is provided, it will attempt to get one for the given taxon names from the Open Tree of Life synthetic tree, using make_bold_otol_tree().
datelife_use_datelifequery( datelife_query = NULL, dating_method = "bladj", each = FALSE )
datelife_query |
A |
dating_method |
Tree dating algorithm to use. Options are "bladj" or "pathd8" (Webb et al., 2008, doi:10.1093/bioinformatics/btn358; Britton et al., 2007, doi:10.1080/10635150701613783). |
each |
Boolean, default to |
If phy has no branch lengths, dating_method is ignored, and the function applies secondary calibrations to date the tree with the BLADJ algorithm. See make_bladj_tree() and use_calibrations_bladj().
If phy has branch lengths, the function can use the PATHd8 algorithm. See use_calibrations_pathd8().
A phylo or multiPhylo object with branch lengths proportional to time. The output object stores the used calibrations and dating_method as attributes(output)$datelife_calibrations and attributes(output)$dating_method.
This function extracts node ages for each taxon pair given in input$tip.labels. It applies the congruification method described in Eastman et al. (2013) doi:10.1111/2041-210X.12051, implemented with the function geiger::congruify.phylo(), to create a data.frame of taxon pair node ages that can be used as secondary calibrations.
extract_calibrations_dateliferesult(input = NULL, each = FALSE)
input |
A |
each |
Boolean, default to |
The function takes a datelifeResult object and calls summarize_datelife_result() with summary_format = "phylo_all". This goes from a datelifeResult object to a phylo or multiPhylo object that is passed to extract_calibrations_phylo().
An object of class calibrations, which is a data.frame (if each = FALSE) or a list of data.frames (if each = TRUE) of node ages for each pair of taxon names. You can access the input data from which the calibrations were extracted with attributes(output)$chronograms.
This function extracts node ages for each taxon pair given in input$tip.labels. It applies the congruification method described in Eastman et al. (2013) doi:10.1111/2041-210X.12051, implemented with the function geiger::congruify.phylo(), to create a data.frame of taxon pair node ages that can be used as secondary calibrations.
extract_calibrations_phylo(input = NULL, each = FALSE)
input |
A |
each |
Boolean, default to |
An object of class calibrations, which is a data.frame (if each = FALSE) or a list of data.frames (if each = TRUE) of node ages for each pair of taxon names. You can access the input data from which the calibrations were extracted with attributes(output)$chronograms.
Eastman et al. (2013) "Congruification: support for time scaling large phylogenetic trees". Methods in Ecology and Evolution, 4(7), 688-691, doi:10.1111/2041-210X.12051.
Extract numeric OTT ids from a character vector that combines taxon names and OTT ids.
extract_ott_ids(x, na.rm = TRUE)
## Default S3 method:
extract_ott_ids(x, na.rm = TRUE)
x |
A character vector of taxon names, or a phylo object with tree tip labels containing OTT ids. |
na.rm |
A logical value indicating whether |
An object of class numeric containing OTT ids only.
## Not run: # This is a flag for package development. You are welcome to run the example.
canis <- rotl::tnrs_match_names("canis")
canis_taxonomy <- rotl::taxonomy_subtree(canis$ott_id)
my_ott_ids <- extract_ott_ids(x = canis_taxonomy$tip_label)
# Get the problematic elements from input
canis_taxonomy$tip_label[attr(my_ott_ids, "na.action")]
## End(Not run) # end dontrun
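For readers without network access, the extraction step can be imitated offline with a regular expression. This sketch is not the extract_ott_ids() method: the tip labels, their id values and the get_ott_ids helper are hypothetical, and it assumes labels that end in "ott" followed by digits.

```r
# Hypothetical tip labels in a "name_ottID" style; the last one has no id.
labels <- c("Canis_lupus_ott247341", "Canis_latrans_ott247331", "Canis_sp")

# Pull the digits that follow a trailing "ott"; labels without one become NA.
get_ott_ids <- function(x) {
  hit <- grepl("ott[0-9]+$", x)
  ids <- rep(NA_real_, length(x))
  ids[hit] <- as.numeric(sub("^.*ott([0-9]+)$", "\\1", x[hit]))
  ids
}

ids <- get_ott_ids(labels)
print(ids)
# Dropping the NA entries keeps a record of their positions in "na.action".
print(stats::na.omit(ids))
```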
datelifeSummary of a datelifeResult object of all Felidae species.
felid_gdr_phylo_all
A list of three elements, containing the summary of a datelifeResult object
List of subset chronograms in phylo format
A data frame with taxon presence across subset chronograms
A data.frame with names of taxa not found in any chronogram
Generated with:
felid_spp <- make_datelife_query(input = "felidae", get_spp_from_taxon = TRUE)
felid_gdr <- get_datelife_result(input = felid_spp, get_spp_from_taxon = TRUE)
felid_gdr_phylo_all <- summarize_datelife_result(datelife_result = felid_gdr, taxon_summary = "summary", summary_format = "phylo_all", datelife_query = felid_spp)
usethis::use_data(felid_gdr_phylo_all)
SDM tree of a datelifeResult object of all Felidae species.
felid_sdm
A list of two elements, containing the summary of a datelifeResult object
An ultrametric phylo object with the SDM tree.
A datelifeResult object with data used to construct phy
Generated with:
felid_spp <- make_datelife_query(input = "felidae", get_spp_from_taxon = TRUE)
felid_gdr <- get_datelife_result(input = felid_spp, get_spp_from_taxon = TRUE)
felid_sdm <- datelife_result_sdm_phylo(felid_gdr)
usethis::use_data(felid_sdm)
Filter a datelifeResult object to find the largest grove.
filter_for_grove(datelife_result, criterion = "taxa", n = 2)
datelife_result |
A |
criterion |
Defaults to |
n |
The degree of taxon name overlap among input chronograms. Defaults
to |
A datelifeResult object filtered to only include one grove of trees.
Force a non-ultrametric phylo object to be ultrametric with phytools::force.ultrametric().
force_ultrametric(phy)
phy |
A |
A phylo object.
get_all_calibrations performs a datelife_search() and gets divergence times (i.e., secondary calibrations) from a chronogram database for each taxon name pair given as input.
get_all_calibrations(input = NULL, each = FALSE)
input |
One of the following:
|
each |
Boolean, default to |
An object of class calibrations, which is a data.frame (if each = FALSE) or a list of data.frames (if each = TRUE) of node ages for each pair of taxon names. You can access the input data from which the calibrations were extracted with attributes(output)$chronograms.
This is less thorough than get_opentree_species(), but much faster. It assumes that a name with exactly two words (genus and species) is a single species; a name with more than two words is assumed to be a subspecies, so the function goes up one level in the hierarchy and returns both the subspecies and the species.
get_all_descendant_species(taxon_name, ott_id)
taxon_name |
A character vector providing an inclusive taxonomic name. |
ott_id |
A numeric vector providing an Open Tree Taxonomic id number for
a taxonomic name. If provided, |
A list of unique OTT names and OTT ids of species within the provided taxon.
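The word-count heuristic described for get_all_descendant_species() can be sketched in base R as follows. The helper names sketch_guess_rank and sketch_parent_species are hypothetical, and the sketch only classifies names; it does not query the Open Tree of Life and is not the package's code.

```r
# Hypothetical helpers (not part of datelife): classify taxon names by the
# number of whitespace-separated words they contain.
sketch_guess_rank <- function(taxon_names) {
  n_words <- lengths(strsplit(trimws(taxon_names), "\\s+"))
  ifelse(n_words == 2, "species", ifelse(n_words > 2, "subspecies", "other"))
}

# "Go up one level": keep only the first two words (genus and species)
sketch_parent_species <- function(taxon_name) {
  words <- strsplit(trimws(taxon_name), "\\s+")[[1]]
  paste(words[1:2], collapse = " ")
}

ranks <- sketch_guess_rank(c("Canis lupus", "Canis lupus familiaris", "Canis"))
ranks
parent <- sketch_parent_species("Canis lupus familiaris")
parent
```

A purely lexical rule like this is fast because it needs no taxonomy lookup, which is consistent with the speed and thoroughness trade-off noted in the description.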
Get grove from a datelifeResult object that can be converted to phylo from a median summary matrix
get_best_grove(datelife_result, criterion = "taxa", n = 2)
datelife_result |
A |
criterion |
Defaults to |
n |
The degree of taxon name overlap among input chronograms. Defaults
to |
A list of two elements:
A datelifeResult object filtered to only include one grove of trees that can be summarized with median or sdm.
The degree of taxon names overlap among trees in the best grove.
Get the tree with the most tips from a multiPhylo object: the biggest tree.
get_biggest_multiphylo(trees)
trees |
A list of trees as |
The largest tree from those given in trees, as a phylo object with an additional $citation element containing the reference of the original publication.
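Selecting the tree with the most tips can be sketched in base R as shown below. To keep the example free of dependencies, the trees are plain lists that mimic only the tip.label element of a phylo object, and the citation strings are made up; this is an illustrative sketch, not the body of get_biggest_multiphylo().

```r
# Minimal stand-ins for phylo objects: only tip.label matters for this sketch.
trees <- list(
  list(tip.label = c("a", "b", "c"), citation = "Study 1 (made up)"),
  list(tip.label = c("a", "b", "c", "d", "e"), citation = "Study 2 (made up)"),
  list(tip.label = c("a", "b"), citation = "Study 3 (made up)")
)

# Count tips per tree and keep the first tree with the maximum count
n_tips <- vapply(trees, function(tree) length(tree$tip.label), integer(1))
biggest <- trees[[which.max(n_tips)]]
biggest$citation
```

Note that which.max() returns the first maximum, so ties are resolved in favor of the earliest tree in the list.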
get_bold_data uses taxon names from a tree topology, a character vector of names or a datelifeQuery object, to search for genetic markers in the Barcode of Life Database (BOLD).
get_bold_data( input = c("Rhea americana", "Struthio camelus", "Gallus gallus"), marker = "COI", ... )
input |
One of the following:
|
marker |
A character vector indicating the gene from BOLD system to be used for branch length estimation. It searches "COI" marker by default. |
... |
Arguments passed on to
|
If input is a phylo object or a newick string, it is used as backbone topology. If input is a character vector of taxon names, an induced synthetic OpenTree subtree is used as backbone.
A phylo object. If there are enough BOLD sequences available for the input taxon names, the function returns a tree with branch lengths proportional to relative substitution rate. If not enough BOLD sequences are available for the input taxon names, the function returns the topology given as input, or a synthetic Open Tree of Life for the taxon names given in input, obtained with get_otol_synthetic_tree().
The function searches DateLife's local database of phylogenetic trees with branch lengths proportional to time (chronograms) with datelife_search(), and extracts available node ages for each pair of given taxon names with extract_calibrations_phylo().
get_calibrations_datelifequery(datelife_query = NULL, each = FALSE)
datelife_query |
A |
each |
Boolean, default to |
The function calls datelife_search() with summary_format = "phylo_all" to get all chronograms in the database containing at least two taxa in input, and generates a phylo or multiPhylo object that will be passed to extract_calibrations_phylo().
An object of class calibrations, which is a data.frame (if each = FALSE) or a list of data.frames (if each = TRUE) of node ages for each pair of taxon names. You can access the input data from which the calibrations were extracted with attributes(output)$chronograms.
The function searches DateLife's local database of phylogenetic trees with branch lengths proportional to time (chronograms) with datelife_search(), and extracts available node ages for each pair of given taxon names with extract_calibrations_phylo().
get_calibrations_vector(input = NULL, each = FALSE)
input |
A character vector of taxon names. |
each |
Boolean, default to |
The function calls datelife_search() with summary_format = "phylo_all" to get all chronograms in the database containing at least two taxa in input, and generates a phylo or multiPhylo object that will be passed to extract_calibrations_phylo().
An object of class calibrations, which is a data.frame (if each = FALSE) or a list of data.frames (if each = TRUE) of node ages for each pair of taxon names. You can access the input data from which the calibrations were extracted with attributes(output)$chronograms.
Get a dated OpenTree induced synthetic subtree from a set of given taxon names, from blackrim's FePhyFoFum service.
get_dated_otol_induced_subtree(input = NULL, ott_ids = NULL, ...)
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
... |
Arguments passed on to |
The dated tree is obtained from Stephen Smith's OpenTree scaling service at https://github.com/FePhyFoFum/gophy. To make an LTT plot of a dated OpenTree tree, you will need to remove singleton nodes with ape::collapse.singles() and probably also apply phytools::force.ultrametric().
A phylo object with edge length proportional to time in Myrs. It will return NA if any ott_id is invalid.
get_datelife_result takes as input a vector of taxon names, a newick string, a phylo object, or a datelifeQuery object. It searches the chronogram database specified in cache for chronograms matching two or more given taxon names. For each matching chronogram, it extracts time of lineage divergence data and stores it as a patristic matrix. It then lists all resulting patristic matrices. Each list element is named with the study citation of the source chronogram.
get_datelife_result( input = NULL, partial = TRUE, cache = "opentree_chronograms", update_opentree_chronograms = FALSE, ... )
input |
One of the following:
|
partial |
Whether to return or exclude partially matching source chronograms,
i.e, those that match some and not all of taxa given in |
cache |
A character vector of length one, with the name of the data object
to cache. Default to |
update_opentree_chronograms |
Whether to update the chronogram database or not.
Defaults to |
... |
Arguments passed on to
|
A datelifeResult object: a named list of patristic matrices.
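To make the link between a patristic matrix and node ages concrete, the base R sketch below uses a toy matrix. It relies on a general property of ultrametric trees: the patristic distance between two tips is twice the age of their most recent common ancestor. The taxon names, distances, and the column names taxonA, taxonB and age are made up for illustration and are not taken from the package.

```r
# Toy patristic matrix (tip-to-tip distances, in Myr) for three made-up taxa
taxa <- c("A", "B", "C")
patristic <- matrix(
  c(0, 10, 30,
    10, 0, 30,
    30, 30, 0),
  nrow = 3, byrow = TRUE, dimnames = list(taxa, taxa)
)

# All unordered taxon pairs, one pair per row
pairs <- t(utils::combn(rownames(patristic), 2))

# In an ultrametric tree, MRCA age = patristic distance / 2
node_ages <- data.frame(
  taxonA = pairs[, 1],
  taxonB = pairs[, 2],
  age = patristic[pairs] / 2
)
node_ages
```

Here A and B share an ancestor 5 Myr old, while each of them shares a 15 Myr old ancestor with C.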
Get a list of patristic matrices from a given datelifeQuery object
get_datelife_result_datelifequery( datelife_query = NULL, partial = TRUE, cache = "opentree_chronograms", update_opentree_chronograms = FALSE, ... )
datelife_query |
A |
partial |
Whether to return or exclude partially matching source chronograms,
i.e, those that match some and not all of taxa given in |
cache |
A character vector of length one, with the name of the data object
to cache. Default to |
update_opentree_chronograms |
Whether to update the chronogram database or not.
Defaults to |
... |
Arguments passed on to
|
If there is just one taxon name in input$cleaned_names, the function will run make_datelife_query() setting get_spp_from_taxon = TRUE. The datelifeQuery used as input can be accessed with attributes(datelifeResult)$query.
A datelifeResult object: a named list of patristic matrices.
This uses the Paleobiology Database's API to gather information on the ages for all specimens of a taxon. It will also look for all descendants of the taxon. It fixes name misspellings if possible.
get_fossil_range(taxon, recent = FALSE, assume_recent_if_missing = TRUE)
taxon |
The scientific name of the taxon you want the range of occurrences of |
recent |
If TRUE, forces the minimum age to be zero |
assume_recent_if_missing |
If TRUE, any taxon missing from pbdb is assumed to be recent |
a data.frame of max_ma and min_ma for the specimens
Get indices of good matrices to apply Super Distance Matrix (SDM) method with make_sdm().
get_goodmatrices(unpadded.matrices)
unpadded.matrices |
A list of patristic matrices, a |
A numeric vector of good matrix indices in unpadded.matrices.
Makes a block of node constraints and node calibrations for a MrBayes run file from a list of taxa and ages, or from a dated tree
get_mrbayes_node_constraints( constraint = NULL, taxa = NULL, missing_taxa = NULL, ncalibration = NULL, age_distribution = "fixed", root_calibration = FALSE, mrbayes_constraints_file = NULL, clockratepr = "prset clockratepr = fixed(1);" )
constraint |
The constraint tree: a phylo object or a newick character string, with or without branch lengths. |
taxa |
A character vector with taxon names to be maintained in tree |
missing_taxa |
A tree, a data frame or a vector listing all missing taxa you want to include.
|
ncalibration |
The node calibrations: a phylo object with branch lengths proportional to time; in this case all nodes from ncalibration will be used as calibration points. Alternatively, a list with two elements: the first is a character vector with node names from phy to calibrate; the second is a numeric vector with the corresponding ages to use as calibrations. |
age_distribution |
A character string specifying the type of calibration. Only "fixed" and "uniform" are implemented for now.
|
root_calibration |
Used to set a calibration at the root or not. Default to FALSE. Only relevant if ncalibration is specified. |
mrbayes_constraints_file |
NULL or a character vector indicating the name of mrbayes constraint and/or calibration block file. |
clockratepr |
A character vector indicating the clockrateprior to be used. |
A set of MrBayes constraints and/or calibration commands printed in console as character strings or as a text file specified in mrbayes_constraints_file.
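The kind of command block described here can be sketched by assembling character strings in base R. The helper name sketch_mrbayes_block, the node names, taxa and ages are all hypothetical, and the command syntax (constraint, calibrate, prset topologypr, prset nodeagepr) follows our reading of the MrBayes 3.2 command reference; treat it as an assumption to check against your MrBayes version, not as the output of get_mrbayes_node_constraints().

```r
# Hypothetical helper (not part of datelife): build MrBayes constraint and
# calibration commands as character strings.
sketch_mrbayes_block <- function(node_taxa, node_ages) {
  # One "constraint" command per named node, listing its taxa
  constraints <- sprintf(
    "constraint %s = %s;",
    names(node_taxa),
    vapply(node_taxa, paste, character(1), collapse = " ")
  )
  # One fixed-age "calibrate" command per named node
  calibrations <- sprintf("calibrate %s = fixed(%s);", names(node_ages), node_ages)
  c(
    constraints,
    calibrations,
    sprintf("prset topologypr = constraints(%s);", paste(names(node_taxa), collapse = ", ")),
    "prset nodeagepr = calibrated;"
  )
}

# Made-up nodes, taxa and ages
node_taxa <- list(
  node1 = c("taxon_a", "taxon_b", "taxon_c"),
  node2 = c("taxon_a", "taxon_b")
)
node_ages <- c(node1 = 15, node2 = 5)
block <- sketch_mrbayes_block(node_taxa, node_ages)
writeLines(block)
```

Writing the block with writeLines(block, "some_file.txt") instead would mimic saving the commands to a file, which is the role mrbayes_constraints_file plays in the real function.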
Get all chronograms from Open Tree of Life database using direct call from Open Tree API
get_opentree_chronograms(max_tree_count = "all")
get_otol_chronograms(max_tree_count = "all")
max_tree_count |
Default to "all", it gets all available chronograms. For testing purposes, a numeric value indicating the max number of trees to be cached. |
A list of 6 elements:
A list of lists of author names of the original studies that published chronograms currently stored in the Open Tree of Life database.
A list of lists of curator names that uploaded chronograms to the Open Tree of Life database.
A list of study identifiers from original studies that published chronograms currently stored in the Open Tree of Life database.
A multiPhylo object storing the chronograms from the Open Tree of Life database.
A character vector indicating the time when the database object was last updated.
A character vector indicating the datelife package version when the object was last updated.
Get all chronograms from Open Tree of Life database
get_opentree_chronograms_slow(max_tree_count = "all")
max_tree_count |
Default to "all", it gets all available chronograms. For testing purposes, a numeric value indicating the max number of trees to be cached. |
A list of 6 elements:
A list of lists of author names of the original studies that published chronograms currently stored in the Open Tree of Life database.
A list of lists of curator names that uploaded chronograms to the Open Tree of Life database.
A list of study identifiers from original studies that published chronograms currently stored in the Open Tree of Life database.
A multiPhylo object storing the chronograms from the Open Tree of Life database.
A character vector indicating the time when the database object was last updated.
A character vector indicating the datelife package version when the object was last updated.
Get all species belonging to a taxon from the Open Tree of Life Taxonomy (OTT)
get_opentree_species(taxon_name, ott_id, synth_tree_only = TRUE)
taxon_name |
A character vector providing an inclusive taxonomic name. |
ott_id |
A numeric vector providing an Open Tree Taxonomic id number for
a taxonomic name. If provided, |
synth_tree_only |
Whether to include species that are in the synthetic Open
Tree of Life only or not. Default to |
A list of unique OTT names and OTT ids of species within the provided taxon.
Get an Open Tree of Life synthetic subtree of a set of given taxon names.
get_otol_synthetic_tree( input = NULL, ott_ids = NULL, otol_version = "v3", resolve = FALSE, ... )
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
otol_version |
Version of Open Tree of Life to use |
resolve |
Defaults to |
... |
Arguments passed on to |
A phylo object
Use this instead of rotl::tol_subtree() when taxa are not in synthesis tree and you still need to get all species or an induced OpenTree subtree
get_ott_children(input = NULL, ott_ids = NULL, ott_rank = "species", ...)
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
ott_rank |
A character vector with the ranks you want to get lineage children from. |
... |
Other arguments to pass to |
A data.frame object.
# An example with the dog genus:
# It is currently not possible to get an OpenTree subtree of a taxon that is
# missing from the OpenTree synthetic tree.
# The dog genus is not monophyletic in the OpenTree synthetic tree, so in
# practice, it has no node to extract a subtree from.
tnrs <- tnrs_match("Canis")
## Not run: # This is a flag for package development. You are welcome to run the example.
rotl::tol_subtree(tnrs$ott_id[1])
#> Error: HTTP failure: 400
#> [/v3/tree_of_life/subtree] Error: node_id was not found (broken taxon).
## End(Not run) # end dontrun
ids <- tnrs$ott_id[1]
names(ids) <- tnrs$unique_name
children <- get_ott_children(ott_ids = ids) # or
children <- get_ott_children(input = "Canis")
if (!is.na(children)) {
  str(children)
  ids <- children$Canis$ott_id
  names(ids) <- rownames(children$Canis)
  tree_children <- datelife::get_otol_synthetic_tree(ott_ids = ids)
  plot(tree_children, cex = 0.3)
}
# An example with flowering plants:
## Not run: # This is a flag for package development. You are welcome to run the example.
oo <- get_ott_children(input = "magnoliophyta", ott_rank = "order")
# Get the number of orders of flowering plants that we have
sum(oo$Magnoliophyta$rank == "order")
## End(Not run) # end dontrun
Get the Open Tree of Life Taxonomic identifiers (OTT ids) and name of one or several given taxonomic ranks from one or more input taxa.
get_ott_clade(input = NULL, ott_ids = NULL, ott_rank = "family")
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
ott_rank |
A character vector with the ranks you want to get lineage children from. |
A list of named numeric vectors with OTT ids from input and all requested ranks.
Get the Open Tree of Life Taxonomic identifier (OTT id) and name of all lineages from one or more input taxa.
get_ott_lineage(input = NULL, ott_ids = NULL)
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
A list of named numeric vectors of ott ids from input and all the clades it belongs to.
## Not run: # This is a flag for package development. You are welcome to run the example.
taxa <- c("Homo", "Bacillus anthracis", "Apis", "Salvia")
lin <- get_ott_lineage(taxa)
lin
# Look up an unknown OTT id:
get_ott_lineage(ott_ids = 454749)
## End(Not run) # end dontrun
get_subset_array_dispatch is used inside get_datelife_result()
get_subset_array_dispatch( study_element, taxa, phy = NULL, phy4 = NULL, dating_method = "PATHd8" )
study_element |
The thing being passed in: an |
taxa |
Vector of taxon names to get a subset for. |
phy |
A user tree to congruify as |
phy4 |
A user tree to congruify in |
dating_method |
The method used for tree dating. |
A patristic matrix with ages for the target taxa.
Get a taxon summary of a datelifeResult object.
get_taxon_summary(datelife_result = NULL, datelife_query = NULL)
datelife_result |
A |
datelife_query |
A |
A datelifeTaxonSummary object, which is a list of 4 elements:
Data as a presence/absence matrix of taxon names across chronograms.
A data.frame with taxon names as row.names() and two columns, one with the number of chronograms that contain a taxon name and the other one with the total number of chronograms that have at least 2 taxon names.
A data.frame with chronogram citations as row.names() and two columns, one with the number of taxon names found in each chronogram and the other one with the total number of taxon names.
A character vector of taxon names that are not found in the chronogram database.
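The elements of such a summary can be approximated with a few lines of base R. The sketch below builds a presence/absence matrix from a toy named list of patristic matrices; the object names (datelife_result_toy, presence, taxon_counts, absent_taxa), the taxa, and the study names are made up, and the real function's output structure may differ.

```r
# Query taxa and a toy "datelifeResult": a named list of patristic matrices
taxa <- c("A", "B", "C", "D")
datelife_result_toy <- list(
  "Study 1 (made up)" = matrix(0, 2, 2, dimnames = list(c("A", "B"), c("A", "B"))),
  "Study 2 (made up)" = matrix(0, 3, 3, dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
)

# Presence/absence of each query taxon (rows) in each chronogram (columns)
presence <- vapply(
  datelife_result_toy,
  function(m) taxa %in% rownames(m),
  logical(length(taxa))
)
rownames(presence) <- taxa

# Number of chronograms containing each taxon, and taxa found in none
taxon_counts <- rowSums(presence)
absent_taxa <- names(taxon_counts)[taxon_counts == 0]
presence
absent_taxa
```

Column sums of the same matrix give the number of query taxa found in each chronogram, which corresponds to the per-chronogram counts described above.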
make_datelife_query2 always uses TNRS (Taxonomic Name Resolution Service) to process input taxon names, correcting misspellings and taxonomic name variations with tnrs_match(), a wrapper of rotl::tnrs_match_names().
get_tnrs_names( input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), reference_taxonomy = "ott", ... )
input |
Taxon names as a character vector of taxon names. Two or more
names can be provided as a single comma separated string or concatenated with |
reference_taxonomy |
A character vector specifying the reference taxonomy to use for TNRS. Options are "ott", "ncbi", "gbif" or "irmng". The function defaults to "ott". |
... |
Arguments passed on to
|
A datelifeTNRS object, which is a list of three elements:
A character vector of names provided as input.
A character vector of taxon names processed with TNRS.
A numeric vector of Open Tree of Life Taxonomy (OTT) ids.
Extract valid children from given taxonomic name(s) or Open Tree of Life Taxonomic identifiers (OTT ids) from a taxonomic source.
get_valid_children(input = NULL, ott_ids = NULL, reference_taxonomy = "ncbi")
input |
Optional. A character vector of names or a |
ott_ids |
If not NULL, it takes this argument and ignores input. A
numeric vector of ott ids obtained with |
reference_taxonomy |
A character vector with the desired taxonomic sources. Options are "ncbi", "gbif" or "irmng". Any other value will retrieve data from all taxonomic sources. The function defaults to "ncbi". |
GBIF and other taxonomies contain deprecated taxa that are not marked as such in the Open Tree of Life Taxonomy. We are relying mainly on the NCBI taxonomy for now.
A named list containing valid taxonomic children of given taxonomic name(s).
# genus Dictyophyllidites with ott id = 6003921 has only extinct children
# in cases like this the same name will be returned
tti <- rotl::taxonomy_taxon_info(6003921, include_children = TRUE)
gvc <- get_valid_children(ott_ids = 6003921)
# More examples:
get_valid_children(ott_ids = 769681) # Psilotopsida
get_valid_children(ott_ids = 56601) # Marchantiophyta
Process a phylo object or a character string to determine if it's correct newick
input_process(input)
input |
Taxon names as one of the following:
|
A phylo object or NA if input is not a tree.
is_datelife_query checks for two things to be TRUE or FALSE. First, that input is of class datelifeQuery. Second, that input is a list that contains at least two elements of a datelifeQuery object:
A character vector of taxon names.
Either NA or a phylo object.
is_datelife_query(input)
input |
An object to be checked as an object with essential properties of a 'datelifeQuery' object. |
If the object has the correct format but it has a class different than datelifeQuery, the class is not modified.
Is determined by the second condition.
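The two checks performed by is_datelife_query() can be sketched as follows. The helper name sketch_is_datelife_query is hypothetical, and the element names cleaned_names and phy are an assumption (cleaned_names appears elsewhere in this manual as input$cleaned_names; phy is inferred). In line with the return description, the result depends only on the second check, and the class check merely emits a message.

```r
# Hypothetical helper (not part of datelife): mimic the two checks.
sketch_is_datelife_query <- function(input) {
  # First check: class membership (reported, but does not affect the result)
  if (!inherits(input, "datelifeQuery")) {
    message("'input' is not of class datelifeQuery.")
  }
  # Second check: a list holding the assumed essential elements
  is.list(input) && all(c("cleaned_names", "phy") %in% names(input))
}

# A list with the right elements but without the class attribute
query_like <- list(cleaned_names = c("Rhea americana", "Struthio camelus"), phy = NA)
ok <- sketch_is_datelife_query(query_like)
not_ok <- sketch_is_datelife_query(c("Rhea americana", "Struthio camelus"))
ok
not_ok
```

The sketch leaves the class of its input untouched, matching the note that an object with the correct format but a different class is not modified.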
Check if we obtained an empty search with the given taxon name(s).
is_datelife_result_empty(datelife_result, use_tnrs = FALSE)
datelife_result |
A |
use_tnrs |
Whether to use Open Tree of Life's Taxonomic Name Resolution Service (TNRS)
to process input taxon names. Default to |
Boolean. If TRUE, no chronograms were found for the given taxon name(s). If FALSE, the chronogram search was successful.
Check if a tree is a valid chronogram.
is_good_chronogram(phy)
phy |
A |
TRUE if it is a valid tree.
This function implements definition 2.8 for n-overlap from Ané et al. (2009) doi:10.1007/s00026-009-0017-x.
is_n_overlap(names_1, names_2, n = 2)
names_1 |
First vector of names |
names_2 |
Second vector of names |
n |
Degree of overlap required |
Boolean for whether the degree of overlap was met or not.
Ané, C., Eulenstein, O., Piaggio-Talice, R., & Sanderson, M. J. (2009). "Groves of phylogenetic trees". Annals of Combinatorics, 13(2), 139-167, doi:10.1007/s00026-009-0017-x.
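A minimal base R sketch of the n-overlap check behind is_n_overlap() is given below. It assumes that two name sets n-overlap when they share at least n names; the helper name sketch_is_n_overlap is hypothetical and the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of datelife): TRUE when the two sets of
# names share at least n distinct names.
sketch_is_n_overlap <- function(names_1, names_2, n = 2) {
  length(intersect(names_1, names_2)) >= n
}

tree_1_tips <- c("A", "B", "C")
tree_2_tips <- c("B", "C", "D")
tree_3_tips <- c("C", "E", "F")

overlap_12 <- sketch_is_n_overlap(tree_1_tips, tree_2_tips)        # share B and C
overlap_13 <- sketch_is_n_overlap(tree_1_tips, tree_3_tips)        # share only C
overlap_13_n1 <- sketch_is_n_overlap(tree_1_tips, tree_3_tips, n = 1)
c(overlap_12, overlap_13, overlap_13_n1)
```

With the default n = 2, two chronograms need at least two taxa in common, which is the minimum for them to share a node age.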
Find all authors and where they have deposited their trees
make_all_associations(outputfile = "depositorcache.RData")
outputfile |
Path including file name. NULL to prevent saving. |
A data.frame of "person" and "urls".
The function takes a tree topology and uses the BLADJ algorithm implemented with phylocomr::ph_bladj() to assign node ages and branch lengths, given a set of fixed node ages and respective node names.
make_bladj_tree(tree = NULL, nodenames = NULL, nodeages = NULL)
tree |
A tree either as a newick character string or as a |
nodenames |
A character vector with names of nodes in tree with known ages |
nodeages |
A numeric vector with the actual ages of named nodes |
Input tree can be dated or not; $edge.length is ignored. Ages given in nodeages are fixed on their corresponding nodes given in nodenames.
A phylo object.
make_bold_otol_tree takes taxon names from a tree topology or a vector of names, searches for genetic markers in the Barcode of Life Database (BOLD), creates an alignment, and reconstructs branch lengths on a tree topology with Maximum Likelihood.
make_bold_otol_tree( input = c("Rhea americana", "Struthio camelus", "Gallus gallus"), marker = "COI", otol_version = "v3", chronogram = TRUE, doML = FALSE, aligner = "muscle", ... )
input |
One of the following:
|
marker |
A character vector indicating the gene from BOLD system to be used for branch length estimation. |
otol_version |
Version of Open Tree of Life to use |
chronogram |
Default to |
doML |
Default to |
aligner |
A character vector indicating whether to use MAFFT or MUSCLE
to align BOLD sequences. It is not case sensitive. Default to MUSCLE,
supported using the msa
package from Bioconductor, which needs to be installed using |
... |
Arguments passed on to
|
If input is a phylo object or a newick string, it is used as backbone topology. If input is a character vector of taxon names, an induced synthetic OpenTree subtree is used as backbone.
A phylo object. If there are enough BOLD sequences available for the input taxon names, the function returns a tree with branch lengths proportional to relative substitution rate. If not enough BOLD sequences are available for the input taxon names, the function returns the topology given as input, or a synthetic Open Tree of Life for the taxon names given in input, obtained with get_otol_synthetic_tree().
Create a cache from Open Tree of Life
make_contributor_cache(outputfile = "contributorcache.RData")
outputfile |
Path including file name |
List containing author and curator results
Go from taxon names to a datelifeQuery object
make_datelife_query( input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), use_tnrs = TRUE, get_spp_from_taxon = FALSE, reference_taxonomy = "ott" )
input |
Taxon names as one of the following:
|
use_tnrs |
Whether to use Open Tree of Life's Taxonomic Name Resolution Service (TNRS)
to process input taxon names. Default to |
get_spp_from_taxon |
Whether to search ages for all species belonging to a
given taxon or not. Default to |
reference_taxonomy |
A character vector specifying the reference taxonomy to use for TNRS. Options are "ott", "ncbi", "gbif" or "irmng". The function defaults to "ott". |
It processes phylo objects and newick character string inputs with input_process(). If input is a multiPhylo object, only the first phylo element will be used. Similarly, if an input newick character string has multiple trees, only the first one will be used.
A datelifeQuery object, which is a list of three elements:
A phylo object or NA, if input is not a tree.
A character vector of cleaned taxon names.
A numeric vector of OTT ids if use_tnrs = TRUE, or NULL if use_tnrs = FALSE.
Go from taxon names to a datelifeQuery object
make_datelife_query2( input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), get_spp_from_taxon = FALSE, reference_taxonomy = "ott", ... )
input |
Taxon names as one of the following:
|
get_spp_from_taxon |
Whether to search ages for all species belonging to a
given taxon or not. Default to |
reference_taxonomy |
A character vector specifying the reference taxonomy to use for TNRS. Options are "ott", "ncbi", "gbif" or "irmng". The function defaults to "ott". |
... |
Arguments passed on to
|
It processes phylo
objects and newick character string inputs
with input_process()
. If input
is a multiPhylo
object, only the first phylo
element will be used. Similarly, if an input
newick character string has multiple trees,
only the first one will be used.
A datelifeQuery
object, which is a list of four elements:
A character vector of input taxon names.
A character vector of taxon names processed with TNRS.
A numeric vector of OTT ids.
A phylo
object or NA
, if input is not a tree.
Make a mrBayes run block file with a constraint topology and a set of node calibrations and missing taxa
make_mrbayes_runfile( constraint = NULL, taxa = NULL, ncalibration = NULL, missing_taxa = NULL, age_distribution = "fixed", root_calibration = FALSE, mrbayes_output_file = "mrbayes_run.nexus" )
constraint |
The constraint tree: a phylo object or a newick character string, with or without branch lengths. |
taxa |
A character vector with taxon names to be maintained in tree |
ncalibration |
The node calibrations: a phylo object with branch lengths proportional to time; in this case all nodes from ncalibration will be used as calibration points. Alternatively, a list with two elements: the first is a character vector with node names from phy to calibrate; the second is a numeric vector with the corresponding ages to use as calibrations. |
missing_taxa |
A tree, a data frame or a vector listing all missing taxa you want to include.
|
age_distribution |
A character string specifying the type of calibration. Only "fixed" and "uniform" are implemented for now.
|
root_calibration |
Used to set a calibration at the root or not. Default to FALSE. Only relevant if ncalibration is specified. |
mrbayes_output_file |
A character vector specifying the name of mrBayes run file and outputs (can specify directory too). |
A MrBayes block run file in nexus format.
Take a constraint tree and use mrBayes to get node ages and branch lengths given a set of node calibrations without any data.
make_mrbayes_tree( constraint = NULL, taxa = NULL, ncalibration = NULL, missing_taxa = NULL, age_distribution = "fixed", root_calibration = FALSE, mrbayes_output_file = "mrbayes_run.nexus" )
constraint |
The constraint tree: a phylo object or a newick character string, with or without branch lengths. |
taxa |
A character vector with taxon names to be maintained in tree |
ncalibration |
The node calibrations: a phylo object with branch lengths proportional to time; in this case all nodes from ncalibration will be used as calibration points. Alternatively, a list with two elements: the first is a character vector with node names from phy to calibrate; the second is a numeric vector with the corresponding ages to use as calibrations. |
missing_taxa |
A tree, a data frame or a vector listing all missing taxa you want to include.
|
age_distribution |
A character string specifying the type of calibration. Only "fixed" and "uniform" are implemented for now.
|
root_calibration |
Used to set a calibration at the root or not. Default to FALSE. Only relevant if ncalibration is specified. |
mrbayes_output_file |
A character vector specifying the name of mrBayes run file and outputs (can specify directory too). |
A phylo
object with branch lengths proportional to time. It saves all
mrBayes outputs in the working directory.
Associate Open Tree of Life authors with studies
make_otol_associations()
data.frame
with author last name, author first and other names, and comma delimited URLs for OToL studies
Create an overlap table
make_overlap_table(results_table)
results_table |
An "author.results" or "curator.results" |
A data.frame
with information on curators and what clades they've worked on
Make a Super Distance Matrix (SDM) from a list of good matrices obtained with get_goodmatrices()
make_sdm(unpadded.matrices, weighting = "flat")
unpadded.matrices |
A list of patristic matrices, a |
weighting |
A character vector indicating how much weight to give to each
tree in
Defaults to |
A matrix.
Associate TreeBase authors with studies
make_treebase_associations()
data.frame
with author last name, author first and other names, and comma delimited URLs for TreeBase studies
Create a cache from TreeBase
make_treebase_cache(outputfile = "treebasecache.RData")
outputfile |
Path including file name |
List containing author and curator results
Add Open Tree of Life Taxonomy to tree nodes.
map_nodes_ott(tree)
tree |
A tree either as a newick character string or as a |
A phylo
object with "nodelabels".
## Not run:
# This is a flag for package development. You are welcome to run the example.
# Load the Open Tree chronograms database cached in datelife:
utils::data(opentree_chronograms)
# Get the small chronograms (i.e., chronograms with fewer than ten tips) to generate a pretty plot:
small <- opentree_chronograms$trees[unlist(sapply(opentree_chronograms$trees, ape::Ntip)) < 10]
# Now, map the Open Tree taxonomy to the nodes of the first tree
phy <- map_nodes_ott(tree = small[[1]])
# and plot it:
# plot_phylo_all(phy)
library(ape)
plot(phy)
nodelabels(phy$node.label)
## End(Not run)
match_all_calibrations
searches a given tree for the most recent common
ancestor (mrca) of all taxon name pairs in a datelifeCalibration
. It uses phytools::findMRCA()
.
match_all_calibrations(phy, calibrations)
phy |
A |
calibrations |
A |
The function takes pairs of taxon names in a secondary calibrations data frame,
and looks for them in the vector of tip labels of the tree. If both are present,
then it gets the node that represents the most recent
common ancestor (mrca) for that pair of taxa in the tree.
Nodes of input phy
can be named or not.
A list of two elements:
A phylo
object with nodes renamed with tree_add_nodelabels()
.
A matchedCalibrations
object, which is the input calibrations
object with two additional columns storing results from the mrca search with
phytools::findMRCA()
: $mrca_node_number
and $mrca_node_name
.
Go from a list of patristic distance matrices to a table of node ages
matrices_to_table(matrices)
matrices |
A named list of patristic distance matrices. Names correspond to the study reference. |
A single data.frame
of "taxonA", "taxonB", and "age".
Go from a patristic distance matrix to a node ages table
matrix_to_table(matrix, reference)
matrix |
A patristic distance matrix. |
reference |
A character vector with the study reference from which the ages come. |
A data.frame
of "taxonA", "taxonB", and "age".
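As a rough illustration of this conversion, the base R sketch below unrolls the upper triangle of a patristic matrix into one row per taxon pair. It is not the package's implementation: the helper name `sketch_matrix_to_table` and the toy matrix are hypothetical, the extra `reference` column is an assumption, and halving the patristic distance to obtain an age assumes an ultrametric chronogram.

```r
# Hypothetical helper for illustration; not a datelife function.
sketch_matrix_to_table <- function(matrix, reference) {
  taxa <- rownames(matrix)
  # Row and column indices of every cell above the diagonal.
  pairs <- which(upper.tri(matrix), arr.ind = TRUE)
  data.frame(
    taxonA = taxa[pairs[, 1]],
    taxonB = taxa[pairs[, 2]],
    # Assumption: the age of a pair's MRCA is half its patristic distance.
    age = matrix[pairs] / 2,
    reference = reference,
    stringsAsFactors = FALSE
  )
}

# Toy patristic distance matrix (made-up values).
toy <- matrix(
  c(0, 2, 6,
    2, 0, 6,
    6, 6, 0),
  nrow = 3,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)
ages_table <- sketch_matrix_to_table(toy, reference = "Toy study")
print(ages_table)
```

Each taxon pair appears once, so a matrix with three taxa yields three rows.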
Message for a multiPhylo input
message_multiphylo()
A relevant message as a character string.
Checks that missing_taxa argument is ok to be used by make_mrbayes_runfile inside tree_add_dates functions.
missing_taxa_check(missing_taxa = NULL, dated_tree = NULL)
missing_taxa |
A tree, a data frame or a vector listing all missing taxa you want to include.
|
dated_tree |
a tree (newick or phylo) with branch lengths proportional to absolute time |
A phylo object, a newick character string or a dataframe with taxonomic assignations
mrca_calibrations
gets nodes of a tree topology given in
phy
that correspond to the most recent common ancestor (mrca) of taxon
pairs given in calibrations
. It uses phytools::findMRCA()
to get mrca nodes.
mrca_calibrations(phy, calibrations)
phy |
A |
calibrations |
A |
The function takes pairs of taxon names in a calibrations data frame,
and looks for them in the vector of tip labels of the tree. If both are present,
then it gets the node that represents the most recent
common ancestor (mrca) for that pair of taxa in the tree.
Nodes of input phy
can be named or not. They will be renamed.
A list of two elements:
A phylo
object with nodes renamed to match results of
the mrca search. Nodes are renamed using tree_add_nodelabels()
.
A matchedCalibrations
object, which is the input calibrations
object with two additional columns storing results from the mrca search with
phytools::findMRCA()
: $mrca_node_number
and $mrca_node_name
.
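The pair-by-pair MRCA lookup described above can be sketched as follows. This is an illustration only: it uses `ape::getMRCA()` as a stand-in for `phytools::findMRCA()`, the toy tree and calibration pairs are made up, and it does not rename nodes as the package does.

```r
library(ape)

# Toy tree and calibration pairs (made-up data for illustration only).
phy <- read.tree(text = "((A,B),(C,D));")
calibrations <- data.frame(
  taxonA = c("A", "A", "C"),
  taxonB = c("B", "C", "Z"),
  stringsAsFactors = FALSE
)

# Look up the MRCA node only when both taxa are tip labels of the tree;
# pairs with a taxon missing from the tree get NA.
calibrations$mrca_node_number <- vapply(
  seq_len(nrow(calibrations)),
  function(i) {
    pair <- c(calibrations$taxonA[i], calibrations$taxonB[i])
    if (!all(pair %in% phy$tip.label)) {
      return(NA_integer_)
    }
    as.integer(getMRCA(phy, pair))
  },
  integer(1)
)
print(calibrations)
```

The pair with taxon "Z" is left unmatched because "Z" is not a tip of the tree.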
Now storing >200 chronograms from Open Tree of Life
opentree_chronograms
A list of six elements, containing data from Open Tree of Life chronograms
A list of lists of author names of the original studies that published chronograms in the Open Tree of Life database.
A list of lists of curator names that uploaded chronograms to the Open Tree of Life database.
A list of study identifiers.
A multiPhylo
object storing the chronograms from Open Tree of
Life database.
A character vector indicating the time when the database object was last updated.
A character vector indicating the datelife utils::packageVersion()
when the database was last updated.
Generated with
opentree_chronograms <- get_opentree_chronograms()
opentree_chronograms$version <- '2023.12.30'
usethis::use_data(opentree_chronograms, overwrite = T, compress = "xz")
and updated with update_datelife_cache()
patristic_matrix_array_congruify
is used for patristic_matrix_array_subset_both and patristic_matrix_array_congruify.
patristic_matrix_array_congruify( patristic_matrix_array, taxa, phy = NULL, dating_method = "PATHd8" )
patristic_matrix_array |
A patristic matrix array, |
taxa |
Vector of taxon names to get a subset for. |
phy |
A user tree to congruify as |
dating_method |
The method used for tree dating. |
A patristic matrix with ages for the target taxa.
Congruify a patristic matrix array from a given phylo
object.
patristic_matrix_array_phylo_congruify( patristic_matrix, target_tree, dating_method = "PATHd8", attempt_fix = TRUE )
patristic_matrix |
A patristic matrix, |
target_tree |
A |
dating_method |
The method used for tree dating. |
attempt_fix |
Default to |
A matrix.
Split a patristic matrix array. Used inside: patristic_matrix_array_congruify
patristic_matrix_array_split(patristic_matrix_array)
patristic_matrix_array |
A patristic matrix array, |
A patristic matrix 3d array.
Subset a patristic matrix array
patristic_matrix_array_subset(patristic_matrix_array, taxa, phy4 = NULL)
patristic_matrix_array |
A patristic matrix array, |
taxa |
Vector of taxon names to get a subset for. |
phy4 |
A user tree to congruify in |
A list with a patristic matrix array and a $problem
if any.
patristic_matrix_array_subset_both
is used inside get_subset_array_dispatch()
.
patristic_matrix_array_subset_both( patristic_matrix_array, taxa, phy = NULL, phy4 = NULL, dating_method = "PATHd8" )
patristic_matrix_array |
A patristic matrix array, |
taxa |
Vector of taxon names to get a subset for. |
phy |
A user tree to congruify as |
phy4 |
A user tree to congruify in |
dating_method |
The method used for tree dating. |
A patristic matrix with ages for the target taxa.
patristic_matrix_list_to_array
is used inside summarize_datelife_result()
, patristic_matrix_array_congruify()
.
patristic_matrix_list_to_array(patristic_matrix_list, pad = TRUE)
patristic_matrix_list |
List of patristic matrices |
pad |
If TRUE, pad missing entries |
A 3d array of patristic matrices
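To make the padding and stacking concrete, here is a minimal base R sketch under stated assumptions. It is not the package's code: the two toy matrices are made up, and padding is done inline by filling a full taxon-by-taxon matrix with NA before copying in the known distances.

```r
# Toy patristic matrices with partly overlapping taxa (made-up values).
m1 <- matrix(c(0, 4, 4, 0), 2, 2, dimnames = list(c("A", "B"), c("A", "B")))
m2 <- matrix(c(0, 10, 10, 0), 2, 2, dimnames = list(c("B", "C"), c("B", "C")))
matrices <- list(study1 = m1, study2 = m2)

# Full, alphabetically ordered set of taxa across all matrices.
all_taxa <- sort(unique(unlist(lapply(matrices, rownames))))

# Pad each matrix to the full taxon set, leaving unknown pairs as NA.
padded <- lapply(matrices, function(m) {
  out <- matrix(NA_real_, length(all_taxa), length(all_taxa),
                dimnames = list(all_taxa, all_taxa))
  out[rownames(m), colnames(m)] <- m
  out
})

# Stack the padded matrices along a third dimension, one slice per study.
matrix_array <- array(
  unlist(padded),
  dim = c(length(all_taxa), length(all_taxa), length(padded)),
  dimnames = list(all_taxa, all_taxa, names(padded))
)
dim(matrix_array)
```

Padding first guarantees that every slice shares the same row and column order, which is what makes the stacked array meaningful.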
Get time of MRCA from patristic matrix. Used in datelife_result_MRCA()
.
patristic_matrix_MRCA(patristic_matrix, na_rm = TRUE)
patristic_matrix |
A patristic matrix (aka a |
na_rm |
If |
The depth of the MRCA as a numeric vector.
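One way to picture the calculation, assuming an ultrametric chronogram: the two most distant tips are separated by a path that runs up to the root and back down, so the MRCA depth is half the largest patristic distance. The sketch below is an illustration with made-up values and may differ from the package's actual code.

```r
# Toy patristic distance matrix (made-up values, in million years).
patristic_matrix <- matrix(
  c(0, 2, 6,
    2, 0, 6,
    6, 6, 0),
  nrow = 3,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Half of the largest tip-to-tip distance, ignoring NA entries.
mrca_age <- 0.5 * max(patristic_matrix, na.rm = TRUE)
mrca_age
```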
patristic_matrix_name_order_test
is only used in patristic_matrix_list_to_array()
.
patristic_matrix_name_order_test( patristic_matrix, standard.rownames, standard.colnames )
patristic_matrix |
A patristic matrix, |
standard.rownames |
A character vector of row names. |
standard.colnames |
A character vector of column names. |
Boolean.
patristic_matrix_name_reorder
is only used in: patristic_matrix_pad()
.
patristic_matrix_name_reorder(patristic_matrix)
patristic_matrix |
A patristic matrix, |
A patristic matrix with row and column names for taxa in alphabetical order.
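A short base R sketch of alphabetical reordering, using a made-up matrix. Indexing by sorted names moves rows and columns together, so each distance stays attached to its taxon pair. This is an illustration, not necessarily how the package does it.

```r
# Toy patristic matrix with taxa in arbitrary order (made-up values).
m <- matrix(
  c(0, 6, 6,
    6, 0, 2,
    6, 2, 0),
  nrow = 3,
  dimnames = list(c("C", "A", "B"), c("C", "A", "B"))
)

# Reorder rows and columns by sorted taxon name.
reordered <- m[sort(rownames(m)), sort(colnames(m))]
reordered
```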
Used in: patristic_matrix_list_to_array()
.
patristic_matrix_pad(patristic_matrix, all_taxa)
patristic_matrix |
A patristic matrix, |
all_taxa |
A vector of names of all taxa you want, including ones not in the patristic matrix. |
A patristic matrix, with NA
for entries between taxa
where at least one was not in the original patristic matrix.
patristic_matrix_taxa_all_matching
is used inside: results_list_process()
.
patristic_matrix_taxa_all_matching(patristic_matrix, taxa)
patristic_matrix |
A patristic matrix, |
taxa |
Vector of taxon names to get a subset for. |
A Boolean.
Convert patristic matrix to a newick string. Used inside: summarize_datelife_result.
patristic_matrix_to_newick(patristic_matrix)
patristic_matrix |
A patristic matrix |
A newick string
Function patristic_matrix_to_phylo
is used inside summarize_datelife_result()
.
patristic_matrix_to_phylo( patristic_matrix, clustering_method = "nj", fix_negative_brlen = TRUE, fixing_method = 0, ultrametric = TRUE, variance_matrix = NULL )
patristic_matrix |
A patristic matrix |
clustering_method |
A character vector indicating the method to construct the tree. Options are:
|
fix_negative_brlen |
Boolean indicating whether to fix negative branch
lengths in resulting tree or not. Default to |
fixing_method |
A character vector specifying the method to fix branch lengths: "bladj", "mrbayes" or a number to be assigned to all branches meeting fixing_criterion |
ultrametric |
Boolean indicating whether to force ultrametric or not. |
variance_matrix |
A variance matrix from a |
We might add the option to insert a function as clustering_method
in the future.
Previously, the function was hard-coded to try Neighbor-Joining (NJ) first and to
fall back to UPGMA if NJ raised an error.
Now, it uses NJ for a "phylo_all" summary, and our own algorithm to
get a tree from a summary matrix.
A rooted phylo
object.
Used in datelife_result_sdm_phylo()
.
patristic_matrix_unpad(patristic_matrix)
patristic_matrix |
A patristic matrix with row and column names for taxa |
patristic_matrix for all_taxa
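Unpadding can be thought of as the inverse of padding: taxa whose row and column hold only NA are dropped. The base R sketch below shows that idea on a made-up matrix. It is an assumption about the behavior, not the package's implementation.

```r
# Toy padded matrix: taxon "C" was absent from the original study.
padded <- matrix(
  c(0, 4, NA,
    4, 0, NA,
    NA, NA, NA),
  nrow = 3,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Flag taxa whose whole column is NA, then drop their rows and columns.
all_na <- apply(is.na(padded), 2, all)
unpadded <- padded[!all_na, !all_na, drop = FALSE]
unpadded
```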
Checks if phy
is a phylo
object and/or a chronogram.
phylo_check(phy = NULL, brlen = FALSE, dated = FALSE)
phy |
A |
brlen |
Boolean. If |
dated |
Boolean. If |
Nothing
Congruify a reference tree and a target tree given as phylo
objects.
phylo_congruify( reference_tree, target_tree, dating_method = "PATHd8", attempt_fix = TRUE )
reference_tree |
A |
target_tree |
A |
dating_method |
The method used for tree dating. |
attempt_fix |
Default to |
A matrix.
Generate uncertainty in branch lengths using a lognormal.
phylo_generate_uncertainty( phy, size = 100, uncertainty_method = "other", age_distribution = "uniform", age_sd = NULL, age_var = 0.1, age_scale = 0, alpha = 0.025, rescale = TRUE )
phy |
A |
size |
A numeric vector indicating the number of samples to be generated. |
uncertainty_method |
A character vector specifying the method to generate uncertainty, e.g. "mrbayes". The default shown in the usage is "other". |
age_distribution |
A character string specifying the type of calibration. Only "fixed" and "uniform" are implemented for now.
|
age_sd |
The standard deviation around the age to use for generating the uncertainty. If not a numeric value, age_var will be used to calculate it. |
age_var |
The variance to calculate age_sd and generate uncertainty. |
age_scale |
How to scale sd by the depth of the node. If 0, same sd for all. If not, older nodes have more uncertainty |
alpha |
The significance level on uncertainty to generate. By default 0.025 |
rescale |
Boolean. If true, observed age will be rescaled each round. |
If you want to change the size of sampled trees you do not need to run mrbayes again. Just use sample_trees("mrbayes_trees_file_directory", size = new_size) and you will get a multiPhylo object with a new tree sample.
A phylo or multiPhylo object with the same topology as phy but different branch lengths
## Not run:
# Generate uncertainty over feline species SDM chronogram.
# Load the data:
data(felid_sdm)
# By default, generates a sample of 100 trees with var = 0.1:
unc <- phylo_generate_uncertainty(felid_sdm$phy)
length(unc)
# Make an LTT plot:
max_age <- max(sapply(unc, ape::branching.times))
ape::ltt.plot(phy = unc[[1]], xlim = c(-max_age, 0), col = "#cce5ff50")
for (i in 2:100) {
  ape::ltt.lines(phy = unc[[i]], col = "#cce5ff50")
}
ape::ltt.lines(felid_sdm$phy, col = "red")
title(c("fake uncertainty", "in Felidae SDM chronogram"))
## End(Not run)
Gets node numbers from any phylogeny
phylo_get_node_numbers(phy)
phy |
A |
A numeric vector with node numbers
Get a subset array from a phylo
object
phylo_get_subset_array( reference_tree, taxa, phy4 = NULL, dating_method = "PATHd8" )
reference_tree |
A |
taxa |
Vector of taxon names to get a subset for. |
phy4 |
A user tree to congruify in |
dating_method |
The method used for tree dating. |
A list with a patristic matrix array and a $problem
if any.
Get a congruified subset array from a phylo
object
phylo_get_subset_array_congruify( reference_tree, taxa, phy = NULL, dating_method = "PATHd8" )
reference_tree |
A |
taxa |
Vector of taxon names to get a subset for. |
phy |
A user tree to congruify as |
dating_method |
The method used for tree dating. |
A list with a patristic matrix array and a $problem
if any.
Check if a tree has branch lengths
phylo_has_brlen(phy)
phy |
A |
A TRUE or FALSE
Prune missing taxa from a phylo
object
Used inside phylo_get_subset_array and phylo_get_subset_array_congruify.
phylo_prune_missing_taxa(phy, taxa)
phy |
A user tree to congruify as |
taxa |
Vector of taxon names to get a subset for. |
A phylo
object.
Subset a reference and a target tree given as phylo
objects.
phylo_subset_both( reference_tree, taxa, phy = NULL, phy4 = NULL, dating_method = "PATHd8" )
reference_tree |
A |
taxa |
Vector of taxon names to get a subset for. |
phy |
A user tree to congruify as |
phy4 |
A user tree to congruify in |
dating_method |
The method used for tree dating. |
A list with a patristic matrix array and a $problem
if any.
phylo_tiplabel_space_to_underscore
is used in: make_mrbayes_runfile()
,
tree_get_singleton_outgroup()
,
congruify_and_check()
, patristic_matrix_array_phylo_congruify()
.
phylo_tiplabel_space_to_underscore(phy)
phy |
A |
A phylo
object.
phylo_tiplabel_underscore_to_space
is used inside patristic_matrix_array_phylo_congruify()
, congruify_and_check()
.
phylo_tiplabel_underscore_to_space(phy)
phy |
A |
A phylo
object.
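The two tip-label conversions amount to a character substitution on the tip labels. The base R sketch below shows the round trip on a plain character vector standing in for a tree's tip labels. It is an illustration, not the package's code.

```r
# Stand-in for the tip labels of a phylo object.
tip_labels <- c("Rhea americana", "Pterocnemia pennata", "Struthio camelus")

# Spaces to underscores, as needed by tools that cannot handle spaces.
with_underscores <- gsub(" ", "_", tip_labels, fixed = TRUE)

# Underscores back to spaces, for display or name matching.
back_to_spaces <- gsub("_", " ", with_underscores, fixed = TRUE)

with_underscores
```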
Get a patristic matrix from a phylo
object.
phylo_to_patristic_matrix(phy, test = TRUE, tol = 0.01, option = 2)
phy |
A |
test |
Default to |
tol |
branching time in |
option |
an integer (1 or 2; see details). |
A patristic matrix.
Pick a grove in the case of multiple groves in a set of trees.
pick_grove(grove_list, criterion = "taxa", datelife_result)
grove_list |
A list of vectors of tree indices. Each element is a grove. |
criterion |
Defaults to |
datelife_result |
A |
A numeric vector of the elements of the picked grove.
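As a hedged illustration of grove picking, the sketch below assumes that criterion = "taxa" means "keep the grove that covers the most distinct taxa". That reading is an assumption, and the taxon sets and grove list are made up; the package's own rule may differ.

```r
# Taxa present in each of four hypothetical source chronograms.
taxa_per_tree <- list(
  c("A", "B", "C"),
  c("B", "C", "D"),
  c("X", "Y"),
  c("Y", "Z")
)

# Two groves, each a vector of tree indices.
grove_list <- list(c(1, 2), c(3, 4))

# Count the distinct taxa covered by each grove.
taxa_per_grove <- vapply(
  grove_list,
  function(grove) length(unique(unlist(taxa_per_tree[grove]))),
  integer(1)
)

# Keep the grove with the most taxa (first one wins in case of a tie).
picked <- grove_list[[which.max(taxa_per_grove)]]
picked
```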
Some plants chronogram
plant_bold_otol_tree
A phylo object with 6 tips and 5 internal nodes
Integer vector with edge (branch) numbers
Character vector with species names of plants
Integer vector with the number of nodes
Character vector with node names
Numeric vector with edge (branch) lengths
Generated with
make_bold_otol_tree(input = "((Zea mays,Oryza sativa),((Arabidopsis thaliana,(Glycine max,Medicago sativa)),Solanum lycopersicum)Pentapetalae);")
usethis::use_data(plant_bold_otol_tree)
Luna L. Sanchez-Reyes [email protected]
Brian O'Meara [email protected]
Problematic chronograms from Open Tree of Life.
problems
A list of trees with unmapped taxa
Before we developed tools to clean and map tip labels for our cached trees, we found some trees that were stored with unmapped tip labels. We extracted them and saved them to be used for testing functions.
Generated with
problems <- opentree_chronograms$trees[sapply(sapply(opentree_chronograms$trees, "[", "tip.label"), function(x) any(grepl("not.mapped", x)))]
usethis::use_data(problems)
opentree_chronograms object from commit https://github.com/phylotastic/datelife/tree/be894448f6fc437241cd0916fab45e84ac3e09c6
Get an mrcaott tag from an OpenTree induced synthetic tree and get its name and ott id
recover_mrcaott(tag)
tag |
A character vector with the mrca tag |
A numeric vector with the OTT id of the original taxon, named with the corresponding OTT name.
Return the relevant curators for a set of studies.
relevant_curators_tabulate(results.index, cache = "opentree_chronograms")
results.index |
A vector from |
cache |
The cached chronogram database. |
A vector with counts of each curator, with names equal to curator names.
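The tabulation can be pictured as counting curator names across the selected studies. The sketch below uses a made-up list of curators per study in place of the cached database, so the names and indices are hypothetical and the code is an illustration only.

```r
# Hypothetical curators per cached study (one list element per study).
curators <- list(
  c("Ana", "Bo"),
  c("Bo"),
  c("Cy", "Bo")
)

# Indices of the studies that matched a query.
results.index <- c(1, 3)

# Count how often each curator appears among the matching studies.
curator_counts <- table(unlist(curators[results.index]))
curator_counts
```

The result is a named count vector, with names equal to curator names.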
results_list_process
is used inside: get_datelife_result()
results_list_process(results_list, taxa = NULL, partial = FALSE)
results_list |
A |
taxa |
Vector of taxon names to get a subset for. |
partial |
If |
A list with the patristic.matrices that are not NA
.
Core function to generate results
run( input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), format = "citations", partial = "yes", plot.width = 600, plot.height = 600, use_tnrs = "no", opentree_chronograms = NULL )
input |
A newick string or vector of taxa |
format |
The output format |
partial |
How to deal with trees that have a subset of taxa in the query |
plot.width |
Width in pixels for output plot |
plot.height |
Height in pixels for output plot |
use_tnrs |
Whether to use OpenTree's TNRS for the input |
opentree_chronograms |
The list of lists containing the input trees and other info |
results in the desired format
Runs MrBayes from R
run_mrbayes(mrbayes_output_file = NULL)
mrbayes_output_file |
A character vector specifying the name of mrBayes run file and outputs (can specify directory too). |
A phylo object with the consensus tree. MrBayes output files are stored in the working directory.
Sample trees from a file containing multiple trees, usually the output trees file of a Bayesian analysis.
sample_trees(trees_file, trees_object = NULL, burnin = 0.25, size = 100)
trees_file |
A character vector indicating the name and directory of file with trees to sample. |
trees_object |
An R object containing a list of trees already read into R from the output trees file of a Bayesian analysis. |
burnin |
A numeric vector indicating the burnin fraction. It should be a number between 0 and 1. Default to 0.25 |
size |
A numeric vector indicating the number of samples to be generated. |
A multiPhylo
object with a random sample of trees.
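The burn-in and sampling steps can be sketched in base R as follows. Character strings stand in for trees, and the exact rounding of the burn-in boundary is an assumption, so treat this as an illustration rather than the package's procedure.

```r
set.seed(1)

# Stand-ins for 200 trees read from a posterior trees file.
trees <- paste0("tree_", 1:200)
burnin <- 0.25
size <- 100

# Discard the first `burnin` fraction of the trees.
first_kept <- floor(burnin * length(trees)) + 1
post_burnin <- trees[first_kept:length(trees)]

# Draw a random sample of the remaining trees without replacement.
sampled <- sample(post_burnin, size = size, replace = FALSE)
length(sampled)
```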
datelifeResult object of some ants
some_ants_datelife_result
A list of one element, containing a named patristic matrix
Generated with:
some_ants_input <- "(Aulacopone_relicta,(((Myrmecia_gulosa,(Aneuretus_simoni,Dolichoderus_mariae)),((Ectatomma_ruidum,Huberia_brounii),Formica_rufa)),Apomyrma_stygia),Martialis_heureka)Formicidae;"
some_ants_datelife_query <- make_datelife_query(input = some_ants_input)
some_ants_datelife_result <- get_datelife_result(input = some_ants_datelife_query)
usethis::use_data(some_ants_datelife_result)
A list with datelifeQuery and datelifeResult objects from a search of taxon names from subset2_taxa
subset2_search
A list with two named elements. datelifeResult object with 24 patristic matrices
A datelifeQuery object using names_subset 2 as input.
A datelifeResult object resulting from a search of names in datelifeQuery
Generated with:
datelife_query <- make_datelife_query(subset2_taxa)
datelife_result <- get_datelife_result(datelife_query)
subset2_search <- list(query = datelife_query, result = datelife_result)
usethis::use_data(subset2_search, overwrite = TRUE)
Long list of >2.7k virus, bacteria, plant and animal taxon names
subset2_taxa
A character vector of length 2778
Generated with:
subset2_taxa <- rphylotastic::url_get_scientific_names("https://github.com/phylotastic/rphylotastic/blob/master/tests/testthat/subset2.txt")
usethis::use_data(subset2_taxa)
https://github.com/phylotastic/rphylotastic/tree/master/tests/testthat
Function summarize_congruifiedCalibrations
returns a table of
summary statistics for each node in congruified_calibrations
argument.
summarize_congruifiedCalibrations(congruified_calibrations, age_column)
congruified_calibrations |
A |
age_column |
A character string indicating the name of the column to be summarized. |
A data.frame
of summarized ages.
Get different types of summaries from a datelifeResult
object, an output from get_datelife_result()
.
This allows rapid processing of data.
If you need a list of chronograms from your datelifeResult
object, this
is the function you are looking for.
summarize_datelife_result( datelife_result = NULL, datelife_query = NULL, summary_format = "phylo_all", na_rm = TRUE, summary_print = c("citations", "taxa"), taxon_summary = c("none", "summary", "matrix"), criterion = "taxa" )
datelife_result |
A |
datelife_query |
A |
summary_format |
A character vector of length one, indicating the output format for results of the DateLife search. Available output formats are:
|
na_rm |
If |
summary_print |
A character vector specifying the type of summary information to be printed to screen. Options are:
Defaults to |
taxon_summary |
A character vector specifying if data on target taxa missing
in source chronograms should be added to the output as a |
criterion |
Defaults to |
The output is determined by the argument summary_format
:
summary_format = "citations"
The function returns a character vector of references.
summary_format = "mrca"
The function returns a named numeric vector of most recent common ancestor (mrca) ages.
summary_format = "newick_[all, sdm, or median]"
The function returns output chronograms as newick strings.
summary_format = "phylo_[all, sdm, median, or biggest]"
The
function returns output chronograms as phylo
or multiPhylo
objects.
summary_format = "html" or "data_frame"
The function returns a 4 column table with data on mrca ages, number of taxa, references, and output chronograms as newick strings.
Ané, C., Eulenstein, O., Piaggio-Talice, R., & Sanderson, M. J. (2009). "Groves of phylogenetic trees". Annals of Combinatorics, 13(2), 139-167, doi:10.1007/s00026-009-0017-x.
This uses the Paleobiology Database's API to gather information on the ages for all specimens of a taxon. It will also look for all descendants of the taxon. It fixes name misspellings if possible. It is basically a wrapper for get_fossil_range.
summarize_fossil_range(taxon, recent = FALSE, assume_recent_if_missing = TRUE)
taxon |
The scientific name of the taxon you want the range of occurrences of |
recent |
If TRUE, forces the minimum age to be zero |
assume_recent_if_missing |
If TRUE, any taxon missing from pbdb is assumed to be recent |
a single row data.frame of max_ma and min_ma for the specimens, with rowname equal to taxon input
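The shape of the returned value can be illustrated without calling the Paleobiology Database. In the sketch below the occurrence ages and the taxon name are made up, and the summary simply takes the oldest maximum and youngest minimum age. It is an assumption about the summarizing step, not the package's code.

```r
# Made-up occurrence ages (in million years) for a hypothetical taxon.
taxon <- "Felis"
occurrences <- data.frame(
  max_ma = c(5.3, 11.6, 2.6),
  min_ma = c(3.6, 7.2, 0.8)
)
recent <- FALSE

# One row per taxon: oldest and youngest bounds across all specimens.
# With recent = TRUE the minimum age is forced to zero.
fossil_range <- data.frame(
  max_ma = max(occurrences$max_ma),
  min_ma = if (recent) 0 else min(occurrences$min_ma),
  row.names = taxon
)
fossil_range
```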
Gets all ages per taxon pair from a distance matrix. Internal function used in summary_matrix_to_phylo_all().
summarize_summary_matrix(summ_matrix)
summ_matrix |
Any summary patristic distance matrix, such as the ones obtained with |
A data.frame
of pairwise ages, with a number of rows equal to the number of pairwise combinations
of column names (or row names), estimated as ncol(summ_matrix)^2 - sum(1:(ncol(summ_matrix)-1))
.
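A quick arithmetic check of the row-count expression above, for a hypothetical matrix with four taxa. Algebraically, n^2 - n(n-1)/2 simplifies to n(n+1)/2, which is the number of unordered pairs when each taxon paired with itself is also counted.

```r
# Number of taxa (columns) in a hypothetical summary matrix.
n <- 4

# Row count as given by the expression in the text.
rows_from_formula <- n^2 - sum(1:(n - 1))

# Closed form of the same quantity.
rows_closed_form <- n * (n + 1) / 2

c(rows_from_formula, rows_closed_form)
```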
Go from a summary matrix to an ultrametric phylo
object.
summary_matrix_to_phylo( summ_matrix, datelife_query = NULL, target_tree = NULL, total_distance = TRUE, use = "mean", ... )
summ_matrix |
Any summary patristic distance matrix, such as the ones obtained with |
datelife_query |
A |
target_tree |
A |
total_distance |
Whether the input |
use |
A character vector indicating what type of age to use for summary tree. One of the following:
|
... |
Arguments passed on to |
It can take a regular patristic distance matrix, but there are simpler methods for that implemented in patristic_matrix_to_phylo().
An ultrametric phylo object.
Get minimum, median, mean, midpoint, and maximum summary chronograms from a summary matrix of a datelifeResult object.
summary_matrix_to_phylo_all( summ_matrix, datelife_query = NULL, target_tree = NULL, total_distance = TRUE, ... )
summ_matrix |
Any summary patristic distance matrix, such as the ones obtained with |
datelife_query |
A |
target_tree |
A |
total_distance |
Whether the input |
... |
Arguments passed on to
|
With this function users can choose the minimum, mean or maximum ages from the summary matrix as calibration points to get a single summary chronogram. Users get all summary chronograms in a multiPhylo object.
A multiPhylo object of length 5. It contains min, mean, median, midpoint, and max summary chronograms.
Summarize patristic matrix array (by default, median). Used inside: summarize_datelife_result.
summary_patristic_matrix_array(patristic_matrix_array, fn = stats::median)
patristic_matrix_array |
3D array of patristic matrices |
fn |
The function to use to summarize |
A 2d array with the median (or max, or mean, etc) of the input array
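The idea of summarizing a 3D array cell by cell can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the array values, taxon labels, and the helper name summarize_cells are made up, and na.rm = TRUE is assumed to be accepted by the summary function.

```r
# Three hypothetical 2 x 2 patristic matrices stacked in a 3D array.
taxa <- c("A", "B")
patristic_matrix_array <- array(
  c(0, 10, 10, 0,
    0, 20, 20, 0,
    0, 60, 60, 0),
  dim = c(2, 2, 3),
  dimnames = list(taxa, taxa, paste0("chronogram_", 1:3))
)

# Apply the summary function over the third dimension, cell by cell.
summarize_cells <- function(x, fn = stats::median) {
  apply(x, MARGIN = c(1, 2), FUN = fn, na.rm = TRUE)
}

median_matrix <- summarize_cells(patristic_matrix_array)
max_matrix <- summarize_cells(patristic_matrix_array, fn = max)
```

Any function that reduces a numeric vector to a single value and takes an na.rm argument (mean, min, max) can be swapped in for fn.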
Summarize a datelifeResult object.
## S3 method for class 'datelifeResult'
summary(object, datelife_query, na_rm = TRUE, ...)
object |
An object of class |
datelife_query |
A |
na_rm |
Default to |
... |
Further arguments passed to or from other methods. |
A named list of 11 elements:
citations |
A character vector of references where chronograms with some or all of the target taxa are published (source chronograms). |
mrca |
A named numeric vector of most recent common ancestor (mrca) ages of target taxa defined in input, obtained from the source chronograms. Names of mrca vector are equal to citations. |
newick_all |
A named character vector of newick strings corresponding to target chronograms derived from source chronograms. Names of newick_all vector are equal to citations. |
newick_sdm |
Only if multiple source chronograms are available. A character vector with a single newick string corresponding to a target chronogram obtained with SDM supertree method (Criscuolo et al. 2006). |
newick_median |
Only if multiple source chronograms are available. A character vector with a single newick string corresponding to a target chronogram from the median of all source chronograms. |
phylo_sdm |
Only if multiple source chronograms are available. A phylo object with a single target chronogram obtained with SDM supertree method (Criscuolo et al. 2006). |
phylo_median |
Only if multiple source chronograms are available. A phylo object with a single target chronogram obtained from source chronograms with median method. |
phylo_all |
A named list of phylo objects corresponding to each target chronogram obtained from available source chronograms. Names of phylo_all list correspond to citations. |
phylo_biggest |
The chronogram with the most taxa. In the case of a tie, the chronogram with clade age closest to the median age of the equally large trees is returned. |
html |
A character vector with an html string that can be saved and then opened in any web browser. It contains a 4 column table with data on target taxa: mrca, number of taxa, citations of source chronogram and newick target chronogram. |
data_frame |
A 4 column data.frame with data on target taxa: mrca, number of taxa, citations of source chronograms and newick string. |
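The tie-breaking rule for the chronogram with the most taxa can be sketched in base R. This is a minimal illustration, not the package's implementation: the study names, taxon counts, clade ages, and the helper name pick_biggest are all hypothetical.

```r
# Hypothetical summaries of four source chronograms.
n_taxa <- c(study_a = 3, study_b = 5, study_c = 5, study_d = 5)
clade_age <- c(study_a = 40, study_b = 90, study_c = 100, study_d = 130)

pick_biggest <- function(n_taxa, clade_age) {
  biggest <- which(n_taxa == max(n_taxa))
  if (length(biggest) == 1) {
    return(names(n_taxa)[biggest])
  }
  # Tie: keep the tree whose clade age is closest to the median age
  # of the equally large trees.
  ages <- clade_age[biggest]
  names(ages)[which.min(abs(ages - stats::median(ages)))]
}

chosen <- pick_biggest(n_taxa, clade_age)
```

Here three chronograms tie at five taxa, so the choice falls on the one whose clade age sits nearest the median of those three ages.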
Summarize a matchedCalibrations object
summary.matchedCalibrations gets the node age distribution from a matchedCalibrations object.
## S3 method for class 'matchedCalibrations'
summary(object, ...)
object |
A |
... |
Further arguments passed to or from other methods. |
Columns in_phy$mrca_node_name and in_phy$reference are factors.
A summaryMatchedCalibrations object, which is a list of two matchedCalibrations objects:
A data.frame subset of input matchedCalibrations object containing taxon name pairs that were not present in the given tree. NULL if all input taxon names are found in the given tree.
A data.frame subset of input matchedCalibrations object containing all taxon name pairs that were present in the given tree.
datelifeResult object of three birds "Rhea americana", "Pterocnemia pennata", and "Struthio camelus"
threebirds_dr
A list of 9 named patristic matrices
Generated with:
threebirds_dr <- get_datelife_result(input = c("Rhea americana", "Pterocnemia pennata", "Struthio camelus"), partial = TRUE, use_tnrs = FALSE, approximate_match = TRUE, cache = "opentree_chronograms")
use_data(threebirds_dr)
Taxon name resolution service (tnrs) applied to a vector of names by batches
tnrs_match(input, reference_taxonomy, tip, ...)

## Default S3 method:
tnrs_match(input, reference_taxonomy = "ott", ...)

## S3 method for class 'phylo'
tnrs_match(input, reference_taxonomy = "ott", tip = NULL, ...)
input |
A character vector of taxon names, or a phylo object with tip names, to be matched to taxonomy. |
reference_taxonomy |
A character vector specifying the reference taxonomy to use for TNRS. Options are "ott", "ncbi", "gbif" or "irmng". The function defaults to "ott". |
tip |
A vector of mode numeric or character specifying the tips to match. If left empty all tips will be matched. |
... |
Arguments passed on to
|
There is no limit to the number of names that can be queried and matched.
The output will preserve all elements from original input phylo object and will add
A character vector indicating the state of mapping of phy$tip.labels:
Tnrs matching was not attempted. Original labeling is preserved.
Matching was manually made by a curator in Open Tree of Life.
Tnrs matching was attempted and successful with no approximate matching. Original label is replaced by the matched name.
Tnrs matching was attempted and successful but with approximate matching. Original labeling is preserved.
Tnrs matching was attempted and unsuccessful. Original labeling is preserved.
A character vector preserving all original labels.
A numeric vector with ott id numbers of matched tips. Unmatched and original tips will be NaN.
If tips are duplicated, tnrs will only be run once (avoiding increases in function running time), but the result will be applied to all duplicated tip labels.
An object of class data frame or phylo, with the added class match_names.
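The handling of duplicated tip labels (resolve each distinct name once, then apply the result to every copy) can be sketched in base R. The resolver below is a hypothetical stand-in that makes no network call; it only upper-cases its input and counts how many names it receives.

```r
# Hypothetical stand-in for a remote name-matching call.
n_queried <- 0
fake_resolver <- function(taxon_names) {
  n_queried <<- n_queried + length(taxon_names)
  toupper(taxon_names)
}

tip_labels <- c("Mus", "Echinus", "Hommo", "Mus")

# Resolve each distinct label once, then map results back to every tip.
unique_labels <- unique(tip_labels)
resolved_unique <- fake_resolver(unique_labels)
resolved_all <- resolved_unique[match(tip_labels, unique_labels)]
```

Four tips produce only three queries, and both copies of "Mus" receive the same result.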
tnrs_match(input = c("Mus"))
tnrs_match(input = c("Mus", "Mus musculus"))
tnrs_match(input = c("Mus", "Echinus", "Hommo", "Mus"))
This function adds missing taxa to a chronogram given in dated_tree. It is still a work in progress.
tree_add_dates( dated_tree = NULL, missing_taxa = NULL, dating_method = "mrbayes", adding_criterion = "random", mrbayes_output_file = "mrbayes_tree_add_dates.nexus" )
dated_tree |
a tree (newick or phylo) with branch lengths proportional to absolute time |
missing_taxa |
A tree, a data frame or a vector enlisting all missing taxa you want to include.
|
dating_method |
The method used for tree dating, options are "mrbayes" and "bladj". |
adding_criterion |
Only used when
|
mrbayes_output_file |
A character vector specifying the name of mrBayes run file and outputs (can specify directory too). |
A phylo object.
Adds labels to nodes with no assigned label
tree_add_nodelabels(tree = NULL, node_prefix = "n", node_index = "node_number")
tree |
A tree either as a newick character string or as a |
node_prefix |
Character vector. If length 1, it will be used to name all nodes with no labels, followed by a number which can be the node_number or consecutive, as specified in node_index. |
node_index |
Character vector. Choose between "from_1" and "node_number" as numeric index for node labels. It will use consecutive numbers from 1 to total node number in the first case and phylo node numbers in the second case (i.e, from Ntip + 1). |
A phylo object
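The two numbering schemes selected by node_index can be sketched in base R without a tree object. This is an illustration of the labeling rule as described in the arguments, not the package's code: the helper name label_nodes and the example labels are hypothetical, and unlabeled nodes are assumed to be empty strings or NA.

```r
# Hypothetical tree summary: 4 tips, 3 internal nodes, one already labeled.
n_tip <- 4
node_label <- c("", "Muridae", "")

label_nodes <- function(node_label, n_tip, node_prefix = "n",
                        node_index = "node_number") {
  n_node <- length(node_label)
  index <- if (node_index == "from_1") {
    seq_len(n_node)            # consecutive numbers from 1
  } else {
    n_tip + seq_len(n_node)    # phylo node numbers, from Ntip + 1
  }
  unlabeled <- is.na(node_label) | node_label == ""
  node_label[unlabeled] <- paste0(node_prefix, index[unlabeled])
  node_label
}

by_node_number <- label_nodes(node_label, n_tip)
from_one <- label_nodes(node_label, n_tip, node_index = "from_1")
```

Existing labels are left untouched in both schemes; only the empty slots receive a prefixed number.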
Function to add an outgroup to any phylogeny, in phylo or newick format
tree_add_outgroup(tree = NULL, outgroup = "outgroup")
tree |
A tree either as a newick character string or as a |
outgroup |
A character vector with the name of the outgroup. If it has length>1, only first element will be used. |
A phylo object with no root edge.
Checks if a tree is a phylo class object; otherwise, it uses input_process. Additionally, it can check if the tree is a chronogram with phylo_check.
tree_check(tree = NULL, ...)
tree |
A tree either as a newick character string or as a |
... |
Arguments passed on to
|
If tree is correctly formatted, it returns a phylo object.
Take a tree with branch lengths and fix negative or zero length branches.
tree_fix_brlen( tree = NULL, fixing_criterion = "negative", fixing_method = 0, ultrametric = TRUE )
tree |
A tree either as a newick character string or as a |
fixing_criterion |
A character vector specifying the type of branch length to be fixed: "negative" or "zero" (the number 0 is also allowed). |
fixing_method |
A character vector specifying the method to fix branch lengths: "bladj", "mrbayes" or a number to be assigned to all branches meeting fixing_criterion |
ultrametric |
Boolean indicating whether to force ultrametric or not. |
A phylo object with no negative or zero branch lengths.
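The numeric case of fixing_method (assigning one number to every branch that meets fixing_criterion) can be sketched in base R on a plain vector of edge lengths. The helper name fix_lengths and the example values are hypothetical, and the sketch covers neither the "bladj" and "mrbayes" methods nor the ultrametric option.

```r
# Hypothetical edge lengths of a small tree.
edge_length <- c(1.5, -0.2, 0, 3.0)

fix_lengths <- function(edge_length, fixing_criterion = "negative",
                        fixing_method = 0) {
  stopifnot(is.numeric(fixing_method))  # "bladj"/"mrbayes" not sketched here
  to_fix <- if (fixing_criterion == "negative") {
    edge_length < 0
  } else {
    edge_length == 0                    # fixing_criterion "zero" (or 0)
  }
  edge_length[to_fix] <- fixing_method
  edge_length
}

no_negative <- fix_lengths(edge_length)
no_zero <- fix_lengths(edge_length, fixing_criterion = "zero",
                       fixing_method = 0.01)
```

Note that replacing negative lengths with 0 leaves zero-length branches behind, so a second pass with the "zero" criterion may be needed to remove both kinds.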
This uses the taxize package's wrapper of the Global Names Resolver to get taxonomic paths for the vector of taxa you pass in. sources is a vector of source labels in order (though it works best if everything uses the same taxonomy, so we recommend using just one source). You can see the options by running taxize::gnr_datasources(). Our default is Catalogue of Life. The output is a phylo object, typically with many singleton nodes if collapse_singles is FALSE. Singleton nodes are nodes with only one descendant (like "Homo" having "Homo sapiens" as its only descendant), but these singletons typically have node.labels.
tree_from_taxonomy( taxa, sources = "Catalogue of Life", collapse_singles = TRUE )
taxa |
Vector of taxon names |
sources |
Vector of names of preferred sources; see taxize::gnr_datasources(). Currently supports 100 taxonomic resources, see details. |
collapse_singles |
If true, collapses singleton nodes |
A list containing a phylo object with resolved names and a vector with unresolved names
## Not run:
# This is a flag for package development. You are welcome to run the example.
taxa <- c(
  "Homo sapiens", "Ursus arctos", "Pan paniscus", "Tyrannosaurus rex",
  "Ginkgo biloba", "Vulcan", "Klingon"
)
results <- tree_from_taxonomy(taxa)
print(results$unresolved) # The taxa that do not match
ape::plot.phylo(results$phy) # may generate warnings due to problems with singletons
ape::plot.phylo(ape::collapse.singles(results$phy), show.node.label = TRUE)
# got rid of singles, but this also removes a lot of the node.labels
## End(Not run)
# end dontrun
Get node numbers, node names, descendant tip numbers and labels of nodes from any tree, and node ages from dated trees.
tree_get_node_data( tree = NULL, nodes = NULL, node_data = c("node_number", "node_label", "node_age", "descendant_tips_number", "descendant_tips_label") )
tree |
A tree either as a newick character string or as a |
nodes |
Numeric vector with node numbers from which you want to obtain data. Default to NULL: obtain data for all nodes in the tree. |
node_data |
A character vector containing one or all from: "node_number", "node_label", "node_age", "descendant_tips_number", "descendant_tips_label" |
A list
Identify the presence of a single lineage outgroup in a phylogeny
tree_get_singleton_outgroup(tree = NULL)
tree |
A tree either as a newick character string or as a |
A character vector with the name of the single lineage outgroup. Returns NA if there is none.
To get tip numbers descending from any given node of a tree
tree_node_tips(tree = NULL, node = NULL, curr = NULL)
tree |
a phylogenetic tree as an object of class |
node |
an integer specifying a node number in the tree. |
curr |
the set of previously stored node numbers - used in recursive function calls. |
A numeric vector with tip numbers descending from a node
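The recursive use of curr can be illustrated with a base R sketch that walks a parent/child edge matrix. It is a simplified stand-in, not the package's function: the helper name node_tips and the example edge matrix are hypothetical, and the tree is passed as an edge matrix plus a tip count rather than as a phylo object.

```r
# Edge matrix of a hypothetical 4-tip tree ((t1,t2),(t3,t4)), in the
# parent/child layout used by phylo objects: tips are 1..4, root is 5.
edge <- rbind(c(5, 6), c(6, 1), c(6, 2), c(5, 7), c(7, 3), c(7, 4))
n_tip <- 4

node_tips <- function(edge, n_tip, node, curr = NULL) {
  children <- edge[edge[, 1] == node, 2]
  for (child in children) {
    if (child <= n_tip) {
      curr <- c(curr, child)                       # reached a tip
    } else {
      curr <- node_tips(edge, n_tip, child, curr)  # descend further
    }
  }
  curr
}

tips_of_node_6 <- node_tips(edge, n_tip, node = 6)
tips_of_root <- node_tips(edge, n_tip, node = 5)
```

Each recursive call receives the tips collected so far through curr and returns the extended set, which is why the top-level call can leave curr as NULL.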
Information on contributors, authors, study ids and clades from studies with chronograms in Open Tree of Life
treebase_cache
A list of five data sets
A dataframe with two elements: author names and number of studies in TreeBase authored by each
A dataframe with two elements: author names and study identifiers
Generated with make_treebase_cache()
TreeBASE database, no longer available online https://en.wikipedia.org/wiki/TreeBASE
This includes opentree chronograms, contributors, treebase and curators. For speed, datelife caches chronograms and other information. Running this (within the checked out version of datelife) will refresh these. Then git commit and git push them back.
update_all_cached()
None
The function calls get_opentree_chronograms() to update the OpenTree chronograms database cached in datelife. It has the option to write the updated object as an .Rdata file, which will be independent of the opentree_chronograms data object that you can load with data("opentree_chronograms", package = "datelife").
update_datelife_cache( write = TRUE, updated_name = "opentree_chronograms_updated", file_path = file.path(tempdir()), ... )
write |
Defaults to |
updated_name |
Used if |
file_path |
Used if |
... |
Arguments passed on to
|
A list of 4 elements:
A list of lists of author names of the original studies that published chronograms currently stored in the Open Tree of Life database.
A list of lists of curator names that uploaded chronograms to the Open Tree of Life database.
A list of study identifiers from original studies that published chronograms currently stored in the Open Tree of Life database.
A multiPhylo object storing the chronograms from Open Tree of Life database.
A character vector indicating the time when the database object was last updated.
A character vector indicating the datelife package version when the object was last updated.
use_all_calibrations generates one or multiple chronograms (i.e., phylogenetic trees with branch lengths proportional to time) by dating a tree topology given in phy, and secondary calibrations given in calibrations, using the algorithm specified in the argument dating_method.
use_all_calibrations( phy = NULL, calibrations = NULL, each = FALSE, dating_method = "bladj", ... )
phy |
A |
calibrations |
A |
each |
Boolean, default to |
dating_method |
Tree dating algorithm to use. Options are "bladj" or "pathd8" (Webb et al., 2008, doi:10.1093/bioinformatics/btn358; Britton et al., 2007, doi:10.1080/10635150701613783). |
... |
Arguments passed on to
|
If phy has no branch lengths, dating_method is ignored, and the function applies secondary calibrations to date the tree with the BLADJ algorithm. See make_bladj_tree() and use_calibrations_bladj().
If phy has branch lengths, the function can use the PATHd8 algorithm. See use_calibrations_pathd8().
A phylo or multiPhylo object with branch lengths proportional to time.
The output object stores the used calibrations and dating_method as attributes(output)$datelife_calibrations and attributes(output)$dating_method.
Webb, C. O., Ackerly, D. D., & Kembel, S. W. (2008). "Phylocom: software for the analysis of phylogenetic community structure and trait evolution". Bioinformatics, 24(18), doi:10.1093/bioinformatics/btn358.
Britton, T., Anderson, C. L., Jacquet, D., Lundqvist, S., & Bremer, K. (2007). "Estimating divergence times in large phylogenetic trees". Systematic biology, 56(5), 741-752. doi:10.1080/10635150701613783.
use_calibrations combines all given calibrations and uses them as constraints to perform a dating analysis on a given tree topology, using BLADJ if it has no branch lengths, or PATHd8 if the given tree topology has initial branch lengths.
use_calibrations( phy = NULL, calibrations = NULL, dating_method = "bladj", type = "median", ... )
phy |
A |
calibrations |
A |
dating_method |
Tree dating algorithm to use. Options are "bladj" or "pathd8" (Webb et al., 2008, doi:10.1093/bioinformatics/btn358; Britton et al., 2007, doi:10.1080/10635150701613783). |
type |
The type of age to use as calibration. Options are "median", "mean", "min", or "max". |
... |
Arguments passed on to
|
If phy has no branch lengths, dating_method is ignored, and the function applies secondary calibrations to date the tree with the BLADJ algorithm. See make_bladj_tree() and use_calibrations_bladj().
If phy has branch lengths, the function can use the PATHd8 algorithm. See use_calibrations_pathd8().
A phylo object with branch lengths proportional to time.
The output object stores the used calibrations and dating_method as attributes(output)$datelife_calibrations and attributes(output)$dating_method.
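How the type argument turns several source ages per node into one calibration age can be sketched in base R. This is an illustration only: the data frame layout (columns node and age), the node labels, and the helper name summarize_calibrations are hypothetical and do not reflect the structure of a matchedCalibrations object.

```r
# Hypothetical secondary calibrations: several source ages per node.
calibrations <- data.frame(
  node = c("n5", "n5", "n5", "n6", "n6"),
  age  = c(100, 110, 150, 40, 60)
)

summarize_calibrations <- function(calibrations, type = "median") {
  fn <- switch(type,
               median = stats::median, mean = mean, min = min, max = max,
               stop("unknown type: ", type))
  # One summary age per node, named by node label.
  tapply(calibrations$age, calibrations$node, fn)
}

median_ages <- summarize_calibrations(calibrations)
max_ages <- summarize_calibrations(calibrations, type = "max")
```

When source ages are skewed, as for node n5 here, the median and mean differ noticeably, which is one reason to compare several types before settling on one.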
The function use_calibrations_bladj prepares the input for BLADJ and calls make_bladj_tree().
use_calibrations_bladj(phy = NULL, calibrations, type = "median", root_age)
phy |
A |
calibrations |
A |
type |
The type of age to use as calibration. Options are "median", "mean", "min", or "max". |
root_age |
Numeric specifying the age of the root. Only used if there are
no ages for the root node in |
The BLADJ algorithm is part of the Phylocom software, presented in Webb et al. (2008) doi:10.1093/bioinformatics/btn358.
A chronogram: a phylo object with branch lengths proportional to time.
Webb, C. O., Ackerly, D. D., & Kembel, S. W. (2008). "Phylocom: software for the analysis of phylogenetic community structure and trait evolution". Bioinformatics, 24(18), doi:10.1093/bioinformatics/btn358.
The function prepares the input for BLADJ and calls make_bladj_tree()
use_calibrations_bladj.matchedCalibrations( calibrations, type = "mean", root_age = NULL )
calibrations |
A |
type |
The type of age to use as calibration. Options are "median", "mean", "min", or "max". |
root_age |
Numeric specifying an age for the root, provided by the user.
Only used if there are no time calibrations for the root node in the chronograms database.
If |
The BLADJ algorithm is part of the Phylocom software, presented in Webb et al. (2008) doi:10.1093/bioinformatics/btn358.
A phylo object with branch lengths proportional to time.
Webb, C. O., Ackerly, D. D., & Kembel, S. W. (2008). "Phylocom: software for the analysis of phylogenetic community structure and trait evolution". Bioinformatics, 24(18), doi:10.1093/bioinformatics/btn358.
use_calibrations_each wraps use_calibrations to take each set of given calibrations and use it independently as constraints for BLADJ or PATHd8 to date a given tree topology.
use_calibrations_each(phy = NULL, calibrations = NULL, ...)
phy |
A |
calibrations |
A |
... |
Arguments passed on to
|
If phy has no branch lengths, dating_method is ignored, and the function applies secondary calibrations to date the tree with the BLADJ algorithm. See make_bladj_tree() and use_calibrations_bladj().
If phy has branch lengths, the function can use the PATHd8 algorithm. See use_calibrations_pathd8().
A multiPhylo object of trees with branch lengths proportional to time.
The output object stores the used calibrations and dating_method as attributes(output)$datelife_calibrations and attributes(output)$dating_method.
use_calibrations_pathd8 uses secondary calibrations to date a tree with initial branch lengths using PATHd8.
use_calibrations_pathd8( phy = NULL, calibrations = NULL, expand = 0.1, giveup = 100 )
phy |
A |
calibrations |
A |
expand |
How much to expand by each step to get consistent calibrations. Should be between 0 and 1. |
giveup |
How many expansions to try before giving up |
This function implements the PATHd8 algorithm described in Britton et al. (2007) doi:10.1080/10635150701613783, with geiger::PATHd8.phylo().
The function first attempts to use the given calibrations as fixed ages. If that fails (often due to conflict between calibrations), it expands the range between the minimum age and the maximum age and tries again, repeating until dating succeeds or the number of expansions reaches giveup. If expand = 0, it uses the summarized calibrations.
In some cases, PATHd8 returns edge lengths in relative time (with maximum tree depth = 1) instead of absolute time, as given by calibrations. In this case, the function returns NA. This is an issue from PATHd8.
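The expand-and-retry behavior controlled by expand and giveup can be sketched in base R. This is a sketch under stated assumptions, not the package's implementation: no dating is performed, is_consistent is a hypothetical stand-in for a successful PATHd8 run, the data frame layout is made up, and widening the ranges multiplicatively by expand is an assumption about how each step works.

```r
# Two hypothetical point calibrations that conflict: the parent node is
# younger than its child.
calibrations <- data.frame(node = c("parent", "child"),
                           min_age = c(10, 12), max_age = c(10, 12))

# Stand-in consistency check: the parent's range must reach the child's range.
is_consistent <- function(cal) cal$max_age[1] >= cal$min_age[2]

expand_until_consistent <- function(cal, expand = 0.1, giveup = 100) {
  attempts <- 0
  while (!is_consistent(cal) && attempts < giveup) {
    cal$min_age <- cal$min_age * (1 - expand)  # assumed multiplicative step
    cal$max_age <- cal$max_age * (1 + expand)
    attempts <- attempts + 1
  }
  list(calibrations = cal, attempts = attempts,
       success = is_consistent(cal))
}

result <- expand_until_consistent(calibrations)
stuck <- expand_until_consistent(calibrations, expand = 0, giveup = 5)
```

With expand = 0 the ranges never change, so a conflicting set of calibrations exhausts all giveup attempts without success.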
A phylo object with branch lengths proportional to time.
Britton, T., Anderson, C. L., Jacquet, D., Lundqvist, S., & Bremer, K. (2007). "Estimating divergence times in large phylogenetic trees". Systematic biology, 56(5), 741-752. doi:10.1080/10635150701613783.
Date a tree with initial branch lengths with treePL.
use_calibrations_treePL(phy, calibrations)
phy |
A |
calibrations |
A |
This function uses treePL as described in Smith, S. A., & O’Meara, B. C. (2012) doi:10.1093/bioinformatics/bts492, with the function treePL.phylo. It attempts to use the calibrations as fixed ages.
If that fails (often due to conflict between calibrations), it will expand the
range of the minimum age and maximum age and try again. And repeat.
If expand = 0, it uses the summarized calibrations.
In some cases, it returns edge lengths in relative time (with maximum tree depth = 1)
instead of absolute time, as given by calibrations. In this case, the function returns NA.
This is an issue from PATHd8.
A phylo object
Smith, S. A., & O’Meara, B. C. (2012). "treePL: divergence time estimation using penalized likelihood for large phylogenies". Bioinformatics, 28(20), 2689-2690, doi:10.1093/bioinformatics/bts492.