Phylogenetic Tools for Comparative Biology

As readers of this blog can probably imagine, I get a lot of emails inquiring on how to do this analysis or another. Sometimes these are using phytools, other times with geiger or other R phylogenetic comparative methods packages, and sometimes they even pertain to phylogeny inference (which I know a bit about, for sure – but have not actually ever really worked on). As often as I can I’ll do my best to respond, and sometimes I’ll even pass along a chapter or two from my recent book with Luke Harmon.

A common inquiry regards ancestral state reconstruction.

This is fair. I recently wrote a book chapter on this topic (which can be read in pre-print form here), so it might be reasonable to call me a bit of an “expert” (though a professed ‘ancestral state’ skeptic, at the same time).

In the book for ancestral state reconstruction of continuous traits we use phytools::fastAnc. Even though there are other implementations out there, fastAnc will still be a good option for many users & even provides the correct variances & 95% confidence intervals based on Rohlf (2001).

For discrete characters, on the other hand, we use the powerful R package corHMM, by Jeremy Beaulieu & al. (including James Boyko, whom I recently had the pleasure of co-teaching an R phylogenetic workshop with). corHMM works great; however, I have since also implemented marginal and joint ancestral state reconstruction for discrete characters in phytools via the new generic method ancr.

There’s no inherent reason to prefer phytools::ancr over corHMM – but if you read this blog, you’re probably already quite used to running code with phytools, so that’s one.

Unfortunately, we published the book before I’d implemented ancr, so it’s not covered there. Since I can’t point to the book for a simple demo of how ancestral state estimation of discrete traits works using phytools::ancr, I thought I might as well make one here!

(There is one published demo of phytools::ancr in my monographic software note for phytools 2.0 in the open-access journal PeerJ.)

Here I’m going to provide not a single tutorial, but a rapid-fire set of demos: first of marginal ASR for a single Mk transition model, then marginal ASR integrating over multiple models using model averaging, marginal ASR when some tips are not known with certainty, then marginal ASR extensions of the Mk model, such as the hidden-rates model, and finally joint reconstruction.

Pretty much all of these methods (except for model-averaging, but this could easily be done ‘manually’) are also implemented in corHMM.

Before we begin, let’s load phytools and its dependencies, which we’ll be using throughout this tutorial.

## load packages
library(phytools)

Marginal ancestral state reconstruction for a single Q matrix

For this first demo, I’ll use a dataset of squamate digit number based on Brandley et al. (2008). These data can be downloaded or read directly from the book website. Here I’ll do the latter, as follows.

squamate.tree<-read.nexus(
  file="http://www.phytools.org/Rbook/6/squamate.tre")
squamate.tree

## 
## Phylogenetic tree with 258 tips and 257 internal nodes.
## 
## Tip labels:
##   Abronia_graminea, Acontias_litoralis, Acontias_meleagris, Acontias_percivali, Acontophiops_lineatus, Acrochordus_granulatus, ...
## 
## Rooted; includes branch lengths.

squamate.data<-read.csv(
  file="http://www.phytools.org/Rbook/6/squamate-data.csv",
  row.names=1)
head(squamate.data)

##                            rear.toes
## Acontias_meleagris                 0
## Acontias_percivali                 0
## Alopoglossus_atriventris           5
## Alopoglossus_copii                 5
## Ameiva_ameiva                      5
## Amphiglossus_igneocaudatus         5

There are some mismatches between our tree and data, so we can start by fixing these (which I’ll do using the still handy geiger function name.check).

chk<-geiger::name.check(squamate.tree,
  squamate.data)
summary(chk)

## 139 taxa are present in the tree but not the data:
##     Abronia_graminea,
##     Acontias_litoralis,
##     Acontophiops_lineatus,
##     Acrochordus_granulatus,
##     Agamodon_anguliceps,
##     Agkistrodon_contortrix,
##     ....
## 1 taxon is present in the data but not the tree:
##     Trachyboa_boulengeri
## 
## To see complete list of mis-matched taxa, print object.

## prune tree
squamate.tree<-drop.tip(squamate.tree,
  chk$tree_not_data)
## subsample data
squamate.data<-squamate.data[squamate.tree$tip.label,,
  drop=FALSE]
## check again
geiger::name.check(squamate.tree,
  squamate.data)

## [1] "OK"

Let’s pull out the data vector of our trait of interest. We’ll always need to do this; however, for ancestral state reconstruction of a discrete character our input data can be a named factor, character, or integer vector (or numeric matrix, more on that below).

squamate.toes<-setNames(
  squamate.data$rear.toes,
  rownames(squamate.data))
head(squamate.toes)

##         Acontias_meleagris         Acontias_percivali   Alopoglossus_atriventris 
##                          0                          0                          5 
##         Alopoglossus_copii              Ameiva_ameiva Amphiglossus_igneocaudatus 
##                          5                          5                          5

Next, I’m going to hypothesize a trait evolution model for these data & fit it. This model is one in which digits of the rear foot are invariably gained or loss in a sequential fashion: than is, to evolve from having 4 digits in the hindfoot to only two, you must first pass through the state of having 3 hindfoot digits. I won’t pre-suppose that these rates are equal among each other and certainly not in the two (loss vs. gain) directions. This seems pretty logical as an approximation.

To fit this particular model to our data, however, I first need to build it using a design matrix of the same dimensions as my trait space (that is $k \times k$ for $k$ levels of the trait), but containing 0s and positive integers, depending on the types of state changes that we want to include in our model and those that we don’t.

ordered_model<-matrix(c(
  0,1,0,0,0,0,
  2,0,3,0,0,0,
  0,4,0,5,0,0,
  0,0,6,0,7,0,
  0,0,0,8,0,9,
  0,0,0,0,10,0),6,6,
  byrow=TRUE,
  dimnames=list(0:5,0:5))
ordered_model

##   0 1 2 3  4 5
## 0 0 1 0 0  0 0
## 1 2 0 3 0  0 0
## 2 0 4 0 5  0 0
## 3 0 0 6 0  7 0
## 4 0 0 0 8  0 9
## 5 0 0 0 0 10 0

Since this model contains quite a few parameters (ten, in fact – though it could be worse), let’s run multiple optimization iterations to try to ensure that we converge to the true Maximum Likelihood solution.

I’ll do this in parallel using the foreach package as I demonstrated earlier this month on the blog. (I’m also going to use try to make sure that if one of our optimization fails, we don’t lose all of them!)

library(foreach)
library(doParallel)
niter<-10 ## set iterations
## set ncores and open cluster
ncores<-min(niter,detectCores()-1)
mc<-makeCluster(ncores,type="PSOCK")
registerDoParallel(cl=mc)
all_fits<-foreach(i=1:niter)%dopar%{ 
  obj<-list()
  class(obj)<-"try-error"
  while(inherits(obj,"try-error")){
    obj<-try(phytools::fitMk(squamate.tree,
      squamate.toes,model=ordered_model,pi="fitzjohn",
      logscale=sample(c(FALSE,TRUE),1),
      opt.method=sample(c("nlminb","optim"),1),
      rand.start=TRUE))
  }
  obj
}
stopCluster(mc) ## stop cluster
lnL<-sapply(all_fits,logLik)
lnL

##  [1] -114.9672 -115.2785 -114.9672 -115.2785 -114.9672 -115.2785 -115.2785 -114.9672 -115.2785
## [10] -114.9672

Here’s our best-fitting ordered model.

squamate_fit<-all_fits[[which.max(lnL)]]
plot(squamate_fit,width=TRUE,color=TRUE,mar=rep(0.1,4),
  xlim=c(-1.8,1),ylim=c(-1.1,1.1),text=FALSE,
  show.zeros=FALSE,max.lwd=6)

plot of chunk unnamed-chunk-76

We shouldn’t worry too much about the very high transition rates between character levels 2 and 3. These just tell us that these two trait levels are probably virtually randomly distributed with respect to one another on the tree. That is to say, if a species has state 3 than it is just as likely that a very close relative will have state 2 as share state 3 and vice versa. Indeed, I predict that the same will be true of our estimated ancestral states.

With our fitted model in hand, we’re ready to run ancr as follows.

squamate_ancr<-ancr(squamate_fit)
squamate_ancr

## Marginal ancestral state estimates:
##     0 1 2 3        4        5
## 120 0 0 0 0 0.125952 0.874048
## 121 0 0 0 0 0.027224 0.972776
## 122 0 0 0 0 0.027248 0.972752
## 123 0 0 0 0 0.041905 0.958095
## 124 0 0 0 0 0.074453 0.925547
## 125 0 0 0 0 0.063773 0.936227
## ...
## 
## Log-likelihood = -114.967167

Very conveniently for us, the "ancr" object class also comes with an accompanying plot method. Here’s what that can look like.

cols<-hcl.colors(n=6)
plot(squamate_ancr,args.nodelabels=list(piecol=cols),
  args.tiplabels=list(piecol=cols),direction="upwards")

plot of chunk unnamed-chunk-78

Great. That’s not bad!

Model-averaged ancestral states

For this next demo, I’ll use a phylogeny of primates and diel activity pattern from Kirk & Kay (2004). The data & tree are packaged with phytools, so there’s no need to download the files from the web (though they’re available from our book website).

## load tree and data
data("primate.tree")
primate.tree

## 
## Phylogenetic tree with 90 tips and 89 internal nodes.
## 
## Tip labels:
##   Allenopithecus_nigroviridis, Cercopithecus_mitis, Cercopithecus_cephus, Cercopithecus_petaurista, Erythrocebus_patas, Chlorocebus_aethiops, ...
## 
## Rooted; includes branch lengths.

data("primate.data")
head(primate.data)

##                                  Group Skull_length Optic_foramen_area Orbit_area
## Allenopithecus_nigroviridis Anthropoid         98.5                7.0      298.7
## Alouatta_palliata           Anthropoid        109.8                5.3      382.3
## Alouatta_seniculus          Anthropoid        108.0                8.0      359.4
## Aotus_trivirgatus           Anthropoid         60.5                3.1      297.4
## Arctocebus_aureus            Prosimian         49.5                1.2      134.8
## Arctocebus_calabarensis      Prosimian         53.8                1.6      156.6
##                             Activity_pattern Activity_pattern_code
## Allenopithecus_nigroviridis          Diurnal                     0
## Alouatta_palliata                    Diurnal                     0
## Alouatta_seniculus                   Diurnal                     0
## Aotus_trivirgatus                  Nocturnal                     2
## Arctocebus_aureus                  Nocturnal                     2
## Arctocebus_calabarensis            Nocturnal                     2

Next, just as before, let’s pull out our data as a single named vector as follows.

activity<-setNames(primate.data$Activity_pattern,
  rownames(primate.data))
head(activity)

## Allenopithecus_nigroviridis           Alouatta_palliata          Alouatta_seniculus 
##                     Diurnal                     Diurnal                     Diurnal 
##           Aotus_trivirgatus           Arctocebus_aureus     Arctocebus_calabarensis 
##                   Nocturnal                   Nocturnal                   Nocturnal 
## Levels: Cathemeral Diurnal Nocturnal

Our three character states are "cathemeral" (active randomly during the day and night", "diurnal," and "nocturnal."

Now we’re ready to fit our models. Even though I might have also contemplated other custom models, as we did for the squamate digits previously, for simplicity here I’ll fit just a model with equal back & forth transition rates between each pair of states (the ER model), a model with symmetric rates (the SYM model), and a model that allows all transition rates between different states to assume different values (the ARD model), as follows.

primate_er<-fitMk(primate.tree,activity,
  model="ER",pi="fitzjohn")
primate_sym<-fitMk(primate.tree,activity,
  model="SYM",pi="fitzjohn")
primate_ard<-fitMk(primate.tree,activity,
  model="ARD",pi="fitzjohn")

After having fit each of these three models, let’s compare them using a generic anova call. Take note that I’m saving the output of anova to a new object called primate_aov.

primate_aov<-anova(primate_er,
  primate_sym,primate_ard)

##                log(L) d.f.      AIC    weight
## primate_er  -29.49184    1 60.98368 0.3348288
## primate_sym -27.69681    3 61.39363 0.2727746
## primate_ard -24.33319    6 60.66637 0.3923967

Our analysis reveals that the strength of evidence supporting each of these three different models is quite similar one to the other. As such, we might choose to use model averaging in our ancestral reconstruction. What this entails is just computing a set of weighted average ancestral states, in which the weights come from our Akaike weights. ancr will do this automatically, in fact, if we pass it our primate_aov object instead of any one of our original fitted models.

primate_ancr<-ancr(primate_aov)
primate_ancr

## Marginal ancestral state estimates:
##    Cathemeral  Diurnal Nocturnal
## 91   0.000018 0.011355  0.988626
## 92   0.002987 0.034054  0.962960
## 93   0.059849 0.881411  0.058740
## 94   0.010596 0.988241  0.001162
## 95   0.001634 0.998335  0.000030
## 96   0.000227 0.999767  0.000006
## ...
## 
## Log-likelihood = -27.106777

We can finish this demo by visualizing the results we obtained.

cols<-setNames(c("goldenrod","lightblue","black"),
  levels(activity))
node.cex<-apply(primate_ancr$ace,1,
  function(x) if(any(x>0.95)) 0.3 else 0.6)
plot(primate_ancr,
  args.plotTree=list(type="arc",arc_height=0.5,
    fsize=0.5,offset=3),
  args.nodelabels=list(cex=node.cex,piecol=cols),
  args.tiplabels=list(cex=0.2,piecol=cols),
  legend=FALSE)
legend(0.2*max(nodeHeights(primate.tree)),
  0.3*max(nodeHeights(primate.tree)),
  levels(activity),pch=16,col=cols,
  horiz=FALSE,cex=0.8,bty="n",pt.cex=2,
  y.intersp=1.2)

plot of chunk unnamed-chunk-85

Ancestral state reconstruction when tip values are uncertain

Something else that phytools can handle easily is ancestral state estimation when the tip values for some or many species are uncertain.

I’ve written about this extensively in the past, most comprehensively here.

In summary, we can either treat tip states as if the ambiguity was observed directly or as if we observe one condition for the character, but some probability exists that we might be incorrect. More details on the differences and logic underlying this distinction is in my previous post on this topic.

Let’s return to our primate example from before and imagine, for the purposes of this demo, that our diel activity behavioral data for primates is highly imperfect. Not only that, but it is asymmetrically imperfect: any species that is observed to be cathemeral (active randomly during the day and night) definitely is cathemeral; but species that are only observed as active during the day are probably diurnal, but may be cathemeral, with a similar logic applying to nocturnally active taxa.

Now, of course, plenty is known about primate diel activity pattern – so I’m in no way asserting this as a genuine attribute of our data; however, this property can & does often affect real interspecific datasets (like our data in urban tolerance in Anolis lizard from Winchell et al. 2020).

To start, I’ll just convert my factor vector activity to a binary matrix using phytools::to.matrix as follows.

Activity<-to.matrix(activity,levels(activity))
head(Activity)

##                             Cathemeral Diurnal Nocturnal
## Allenopithecus_nigroviridis          0       1         0
## Alouatta_palliata                    0       1         0
## Alouatta_seniculus                   0       1         0
## Aotus_trivirgatus                    0       0         1
## Arctocebus_aureus                    0       0         1
## Arctocebus_calabarensis              0       0         1

Next, for each species, I’ll (arbitrarily) determine that if the species is coded as Diurnal or Nocturnal it might actually be Cathemeral with, say, a 10% probability.

for(i in 1:nrow(Activity)){
  Activity[i,]<-if(Activity[i,"Diurnal"]==1) c(0.1,0.9,0) else
    if(Activity[i,"Nocturnal"]==1) c(0.1,0,0.9) else
      c(1,0,0)
}
head(Activity)

##                             Cathemeral Diurnal Nocturnal
## Allenopithecus_nigroviridis        0.1     0.9       0.0
## Alouatta_palliata                  0.1     0.9       0.0
## Alouatta_seniculus                 0.1     0.9       0.0
## Aotus_trivirgatus                  0.1     0.0       0.9
## Arctocebus_aureus                  0.1     0.0       0.9
## Arctocebus_calabarensis            0.1     0.0       0.9

This is what that looks like.

Now, just as before, we can run our models and estimate ancestral states.

primate_er2<-fitMk(primate.tree,Activity,
  model="ER",pi="fitzjohn")
primate_sym2<-fitMk(primate.tree,Activity,
  model="SYM",pi="fitzjohn")
primate_ard2<-fitMk(primate.tree,Activity,
  model="ARD",pi="fitzjohn")
primate_aov2<-anova(primate_er2,primate_sym2,
  primate_ard2)

##                 log(L) d.f.      AIC    weight
## primate_er2  -36.15873    1 74.31746 0.6472514
## primate_sym2 -35.32473    3 76.64947 0.2016893
## primate_ard2 -32.61379    6 77.22758 0.1510593

primate_ancr2<-ancr(primate_ard2)
primate_ancr2

## Marginal ancestral state estimates:
##    Cathemeral  Diurnal Nocturnal
## 91   0.000008 0.000537  0.999455
## 92   0.006294 0.014579  0.979127
## 93   0.156004 0.787791  0.056205
## 94   0.026345 0.972644  0.001011
## 95   0.003722 0.996256  0.000022
## 96   0.000495 0.999502  0.000004
## ...
## 
## Log-likelihood = -32.613791

That’s all there is to it!

Ancestral state reconstruction under the hidden-rates model

The phytools generic method ancr is designed to work not only with fitMk model objects, but also with many of the other discrete character evolution model object that phytools can produce. These include fitted models from fitgammaMk (for $\Gamma$ distributed rate heterogeneity among edges), fitfnMk (for an ordered discrete character evolution model where the rate of evolution between adjacent states varies as a polynomial functional form), fitpolyMk (for data with intraspecific polymorphism), and fitHRM (for the hidden-rates model).

To illustrate ancestral state reconstruction under just one of these, the hidden-rates model, I’m going to use a dataset for pollen grain number in angiosperms from Williams et al. (2014). This is the same dataset that we used in our book and I wrote a previous post here comparing the hidden-rates and $\Gamma$ models for these data. The Williams et al. (2014) can be downloaded from our book webpage, or read in directly from the web as I’ll do below.

First let’s get the data into R using read.csv.

pollen_data<-read.csv(
  file="http://www.phytools.org/Rbook/7/pollen-data.csv",
  row.names=1)
head(pollen_data)

##                        V1
## Nuphar_lutea            3
## Nuphar_pumila           2
## Nuphar_advena           2
## Austrobaileya_scandens  2
## Schisandra_propinqua    2
## Kadsura_heteroclita     2

Next the tree, of course.

pollen_tree<-read.tree(
  file="http://www.phytools.org/Rbook/7/pollen-tree.phy")
pollen_tree

## 
## Phylogenetic tree with 511 tips and 510 internal nodes.
## 
## Tip labels:
##   Nuphar_lutea, Nuphar_pumila, Nuphar_advena, Austrobaileya_scandens, Schisandra_propinqua, Kadsura_heteroclita, ...
## 
## Rooted; includes branch lengths.

As always, our next step must be to pull our discrete character out as a new vector with names corresponding to the tip labels of our tree.

pollen<-setNames(pollen_data$V1,
  rownames(pollen_data))

To fit my hidden-rates model, I’m going to use a model with just two hidden levels. In the book I believe that we followed Williams et al. & used a model with three levels. This would take a lot longer to run, so for demonstrative purposes only (and with no assertion that this is a good model for our data) I’ll stick to the two-level model here.

pollen_hrm<-fitHRM(pollen_tree,pollen,ncat=2,
  pi="fitzjohn",parallel=TRUE,niter=20)

## 
## This is the design matrix of the fitted model.
## Does it make sense?
## 
##    2 2* 3 3*
## 2  0  1 2  0
## 2* 3  0 0  4
## 3  5  0 0  6
## 3* 0  7 8  0
## 
## Opened cluster with 15 cores.
## Running optimization iterations in parallel.
## Please wait....

pollen_hrm

## Object of class "fitHRM".
## 
## Observed states: [ 2, 3 ]
## Number of rate categories per state: [ 2, 2 ]
## 
## Fitted (or set) value of Q:
##            2        2*         3        3*
## 2  -0.090800  0.002171  0.088629  0.000000
## 2*  0.002895 -0.002895  0.000000  0.000000
## 3   0.961639  0.000000 -0.999650  0.038012
## 3*  0.000000  0.000000  0.003129 -0.003129
## 
## Fitted (or set) value of pi:
##        2       2*        3       3* 
## 0.108009 0.785332 0.101532 0.005126 
## due to treating the root prior as (a) nuisance.
## 
## Log-likelihood: -201.944088 
## 
## Optimization method used was "nlminb"
## 
## R thinks it has found the ML solution.

Finally, obtaining marginal ancestral states is done in the typical way, as follows.

pollen_ancr<-ancr(pollen_hrm)
pollen_ancr

## Marginal ancestral state estimates:
##            2       2*        3       3*
## 512 0.018264 0.965556 0.016139 0.000041
## 513 0.635040 0.249619 0.061181 0.054160
## 514 0.655555 0.273008 0.057236 0.014201
## 515 0.017118 0.981453 0.001399 0.000030
## 516 0.061269 0.932561 0.005148 0.001021
## 517 0.251698 0.727173 0.021026 0.000103
## ...
## 
## Log-likelihood = -201.944088

cols<-c(hcl.colors(6)[c(1,2,4,5)],
  c("2","2*","3","3*"))
plot(pollen_ancr,
  args.plotTree=list(type="arc",arc_height=0.5),
  args.nodelabels=list(piecol=cols,cex=0.25),
  args.tiplabels=list(piecol=cols))

plot of chunk unnamed-chunk-96

Terrific! That was easy.

Joint ancestral state reconstruction

Marginal ancestral state reconstruction, which we’ve been doing so far, has the goal of computing the marginal scaled likelihoods that each node is in each of the observed (or hidden) states of a character, integrating of all states at all other nodes of the phylogeny. If we want to make explicit, probabilistic statements about particular nodes, then this is the ancestral state reconstruction method we should choose.

Joint ancestral state reconstruction, on the other hand, has a different aim. It involves identifying the set of node values at all nodes for which the probability of our observed data under the model (the likelihood) is the highest. Joint reconstruction is less useful for making explicit probabilistic statements about individual nodes, but is better if we want to make an observation about how our character is most likely to have evolved node by node in the tree.

To illustrate joint ancestral state reconstruction in phytools, I’ll use a dataset for bony fish obtained from Benun Sutton & Wilson (2019) but that modifies a phylogeny that was originally published by Betancur et al. (2017). These files are available from the book website, but are also packaged with phytools and thus most easily loaded directly as follows.

data(bonyfish.tree)
bonyfish.tree

## 
## Phylogenetic tree with 90 tips and 89 internal nodes.
## 
## Tip labels:
##   Xenomystus_nigri, Chirocentrus_dorab, Talismania_bifurcata, Alepocephalus_tenebrosus, Misgurnus_bipartitus, Opsariichthys_bidens, ...
## 
## Rooted; includes branch lengths.

data(bonyfish.data)
head(bonyfish.data)

##                          spawning_mode paternal_care
## Xenomystus_nigri                  pair          male
## Chirocentrus_dorab               group          none
## Talismania_bifurcata             group          none
## Alepocephalus_tenebrosus         group          none
## Misgurnus_bipartitus              pair          none
## Opsariichthys_bidens              pair          none

For our analysis, we’ll model and reconstruct the history of the character of ‘spawning mode’, which has two different values in our dataset: "pair" or "group".

spawning_mode<-setNames(
  bonyfish.data$spawning_mode,
  rownames(bonyfish.data))
head(spawning_mode)

##         Xenomystus_nigri       Chirocentrus_dorab     Talismania_bifurcata 
##                     pair                    group                    group 
## Alepocephalus_tenebrosus     Misgurnus_bipartitus     Opsariichthys_bidens 
##                    group                     pair                     pair 
## Levels: group pair

bonyfish_er<-fitMk(bonyfish.tree,spawning_mode,
  model="ER",pi="fitzjohn")
bonyfish_ard<-fitMk(bonyfish.tree,spawning_mode,
  model="ARD",pi="fitzjohn")
anova(bonyfish_er,bonyfish_ard)

##                 log(L) d.f.      AIC    weight
## bonyfish_er  -29.51819    1 61.03638 0.6726777
## bonyfish_ard -29.23851    2 62.47702 0.3273223

This tells us that, of these two models, the ER model is better-supported compared to its more parameter-rich counterpart. To undertake joint reconstruction, we just run the following code on our chosen fitted model.

bonyfish_joint<-ancr(bonyfish_er,type="joint")
bonyfish_joint

## Joint ancestral state estimates:
##    state
## 91  pair
## 92  pair
## 93  pair
## 94  pair
## 95 group
## 96  pair
## ...
## 
## Log-likelihood = -30.54223

Just for interest, let’s also perform marginal reconstruction under the same model, and compare our results.

bonyfish_marginal<-ancr(bonyfish_er,
  type="marginal")

par(mfrow=c(2,1))
obj<-plot(bonyfish_joint,direction="upwards",
  legend=FALSE)
legend("bottomright",levels(spawning_mode),
  pch=16,col=obj$piecol,bty="n",cex=0.8,
  pt.cex=1.5)
obj<-plot(bonyfish_marginal,direction="upwards",
  legend=FALSE)
legend("bottomright",levels(spawning_mode),
  pch=16,col=obj$piecol,bty="n",cex=0.8,
  pt.cex=1.5)

$plot of chunk unnamed-chunk-103$

Here, our values are largely concordant between these two different methods – but that needn’t be the case, and it’s not uncommon for the state at a specific node in the best joint reconstruction to not even be the condition with the highest marginal likelihood!

That’s all I have for these different demos. I hope it’s useful to readers of the blog and others who contact me with questions about this sort of thing.

Phylogenetic Tools for Comparative Biology

Pages

Saturday, April 27, 2024

A few useful demos on ancestral state reconstruction for discrete characters using phytools

Marginal ancestral state reconstruction for a single Q matrix

Model-averaged ancestral states

Ancestral state reconstruction when tip values are uncertain

Ancestral state reconstruction under the hidden-rates model

Joint ancestral state reconstruction