--- title: "Loading phylogenetic trees into R" author: "Martin R. Smith " date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Loading phylogenetic trees into R} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- If a tree is available in a text-based format, it should not be too difficult to load into R. ## Manual entry A straightforward, if cumbersome, way to enter trees into R is by manual text entry: ```{r text-entry, fig.align='center', fig.width=6, out.width='75%'} myTree <- ape::read.tree(text = "((A, B), ((C, D), (E, F)));") plot(myTree) ``` Beware common gotchas when entering text: - Ensure that the tree string ends in a semi-colon. - Make sure that brackets are matched. Some text editors, such as [RStudio](https://posit.co/) or [Notepad++](https://notepad-plus-plus.org/), contain built-in syntax highlighting that will highlight the counterpart to each parenthesis as you type. - Look out for ((double brackets)) or brackets enclosing a single node: the resultant 'invisible nodes' can cause R to crash. ```{r double-bracket, fig.align='center', fig.width=6, out.width='75%'} badTree <- ape::read.tree(text = "((A, B), (((C, D), ((E), F))));") plot(badTree) ape::nodelabels(bg = c(3, 3, 2, 2, 3, 3, 2)) ``` The text is expected in Newick format, so can contain edge lengths - but edge lengths must be included for every edge of the tree, or they will be ignored. ```{r branch-lengths, fig.align='center', fig.width=6, out.width='75%'} myTree <- ape::read.tree( text = "((A:1, B:1):2, ((C:1, D:1):2, (E:1, F:1):2):4);" ) plot(myTree) ``` ## From a file First off, make sure that you are comfortable [telling R where to find a file](filesystem-navigation.html). ### Nexus files You can load trees from a nexus file using: ```r filename <- "my_file_name.nex" ape::read.nexus(filename) ``` If the file contains multiple trees, this will return a list of all trees in the file, with the class `multiPhylo`. If there is a single tree, then this will be returned as a tree of class `phylo` unless the option `force.multiPhylo = TRUE`. This can be useful when an unknown number of trees are to be processed in bulk, for example by ```r lapply(ape::read.nexus(filename, force.multiPhylo = TRUE), ape::consensus) ``` ### Newick files Newick trees can be read with `ape::read.tree(filename)`; `phytools::read.newick(filename)` does not report an error when a 'node' has a single descendant (which will often denote a misspecified tree). ### Trees from TNT Trees saved using [TNT](https://www.lillo.org.ar/phylogeny/tnt/) can be opened in R using `ReadTntTree()`. Trees should be saved in parenthetical format (TNT command `tsav*`), rather than TNT's compressed format (TNT command `tsav`). The TNT command `taxname=` will write taxon names to file, which results in larger but easier to read files. Trees in such a file will be read using the terminal names saved within the file. The TNT command `taxname-` saves just the numbers of the terminals. In order for the trees to be reunited with the names of their tips, terminal labels will be read from the linked matrix file listed in the first line of the .tre file. Ensure that this file exists in the expected location. If it does not, use `ReadTntTree(treeFile, relativePath = '../path/to/tipNameFile')`, or `tipLabels` to manually specify the names of the tips (e.g. `ReadTntTree(treeFile, tipLabels = c('outgroup', letters[1:8])`). ### Other formats Are there other formats that should be listed here? Please let me know by [opening an issue](https://github.com/ms609/TreeSearch/issues/new?title=Loading%20trees). ## Export (write) trees To save a tree to a text file, use `ape::write.tree(tree, file='filename.txt')` for Newick format (widely supported by most phylogenetic software), or `ape::write.nexus(tree, file='filename.nex')` for Nexus format. ## What next? You might want to: - [Load phylogenetic data](load-data.html) into R. - Conduct parsimony search using Brazeau, Guillerme & Smith's [approach to inapplicable data](https://ms609.github.io/TreeSearch/articles/tree-search.html), or using [Profile parsimony](https://ms609.github.io/TreeSearch/articles/profile.html). - [Calculate distances](https://ms609.github.io/TreeDist/articles/Using-TreeDist.html) between pairs of trees.