Paleobiology
Published by: The Paleontological Society
Paleobiology 32(4):578-601. 2006
doi: 10.1666/05070.1
Fitting and comparing models of phyletic evolution: random walks and beyond

Gene Hunt.
Department of Paleobiology, National Museum of Natural History, Smithsonian Institution, Washington, D.C. 20013-7012. hunte@si.edu
Abstract
For almost 30 years, paleontologists have analyzed evolutionary sequences in terms of simple null models, most commonly random walks. Despite this long history, there has been little discussion of how model parameters may be estimated from real paleontological data. In this paper, I outline a likelihood-based framework for fitting and comparing models of phyletic evolution. Because of its usefulness and historical importance, I focus on a general form of the random walk model. The long-term dynamics of this model depend on just two parameters: the mean (μstep) and variance (σ²step) of the distribution of evolutionary transitions (or “steps”). The value of μstep determines the directionality of a sequence, and σ²step governs its volatility. Simulations show that these two parameters can be inferred reliably from paleontological data regardless of how completely the evolving lineage is sampled.
In addition to random walk models, suitable modification of the likelihood function permits consideration of a wide range of alternative evolutionary models. Candidate evolutionary models may be compared on equal footing using information statistics such as the Akaike Information Criterion (AIC). Two extensions to this method are developed: modeling stasis as an evolutionary mode, and assessing the homogeneity of dynamics across multiple evolutionary sequences. Within this framework, I reanalyze two well-known published data sets: tooth measurements from the Eocene mammal Cantius, and shell shape in the planktonic foraminifera Contusotruncana. These analyses support previous interpretations about evolutionary mode in size and shape variables in Cantius, and confirm the significantly directional nature of shell shape evolution in Contusotruncana. In addition, this model-fitting approach leads to a further insight about the geographic structure of evolutionary change in this foraminiferan lineage.
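The behavior described above can be explored with a minimal simulation sketch. It assumes normally distributed steps and trait values observed without sampling error, which is simpler than the general likelihood developed in the paper; the function names `simulate_grw` and `fit_grw` are hypothetical, and the closed-form estimates apply only under these simplifying assumptions.

```python
import numpy as np

def simulate_grw(n_steps, mu_step, var_step, rng):
    """Trait values at times 0..n_steps for a general random walk (normal steps assumed)."""
    steps = rng.normal(mu_step, np.sqrt(var_step), size=n_steps)
    return np.concatenate(([0.0], np.cumsum(steps)))

def fit_grw(times, traits):
    """ML estimates of (mu_step, var_step), assuming no sampling error.

    A transition spanning dt steps is treated as normal with mean mu_step * dt
    and variance var_step * dt.
    """
    dt = np.diff(times).astype(float)
    dx = np.diff(traits)
    mu_hat = dx.sum() / dt.sum()
    var_hat = np.mean((dx - mu_hat * dt) ** 2 / dt)
    return mu_hat, var_hat

rng = np.random.default_rng(1)
path = simulate_grw(10_000, mu_step=0.1, var_step=0.1, rng=rng)

# Sample the same lineage at different levels of completeness.
for n_sampled in (10, 100, 1000, 10_000):
    idx = np.sort(rng.choice(np.arange(1, 10_001), size=n_sampled, replace=False))
    times = np.concatenate(([0], idx))
    mu_hat, var_hat = fit_grw(times, path[times])
    print(f"{n_sampled:>6} samples: mu_step = {mu_hat:.3f}, var_step = {var_hat:.3f}")
```

Sparser sampling leaves fewer transitions to work with, so the estimates become noisier, but each transition is scaled by its elapsed time, so incompleteness alone should not shift the estimates in one direction.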
Accepted: June 7, 2006
Appendix
Parameter Estimates for the General Random Walk Model
For the special case in which samples are evenly spaced in time and sampling error for trait means is the same for all samples, simple equations can be derived for the maximum-likelihood estimates for the parameters of the general random walk, μstep and σ²step.
In order to simplify notation, let x denote an evolutionary transition in trait X (x = XD − XA). If all samples have the same phenotypic variance (Vp) and sample size (n), the sampling variance for each transition will be equal to ε = 2Vp/n. Making these substitutions into equation (6) yields the log-likelihood of a single evolutionary transition, x
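As a sketch of how such estimates can be obtained, the derivation below assumes that each transition is normally distributed and that the interval between successive samples is taken as one time unit, so that a transition has mean μstep and variance σ²step + ε. Equation (6) is not reproduced here, so this normal form is an assumption, as are the symbols N (the number of transitions) and x̄ (their mean).

```latex
% Log-likelihood of a single transition x (assumed normal, unit time interval)
\ell(x) = -\tfrac{1}{2}\log(2\pi)
          - \tfrac{1}{2}\log\!\left(\sigma^2_{\mathrm{step}} + \varepsilon\right)
          - \frac{\left(x - \mu_{\mathrm{step}}\right)^2}
                 {2\left(\sigma^2_{\mathrm{step}} + \varepsilon\right)}

% Summing over N transitions treated as independent
L = \sum_{i=1}^{N} \ell(x_i)

% Setting dL/d(mu_step) = 0
\hat{\mu}_{\mathrm{step}} = \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i

% Setting dL/d(sigma^2_step) = 0
\hat{\sigma}^2_{\mathrm{step}} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2 - \varepsilon
```

Under these assumptions the step mean is estimated by the average observed transition, and the step variance by the variance of the transitions (with divisor N) minus the sampling variance ε.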
Parameter Estimates for the Stasis Model
Assuming constant sampling error, we can derive simple equations for maximum likelihood estimators of the parameters of the stasis model: θ, the trait optimum, and ω, the variance around this optimum. Again, let X denote a trait value, and x refer to an evolutionary transition in this trait (x = XD − XA). As explained in the text, the expected evolutionary transition (x) is a function of the ancestral trait value, such that the mean step is θ − XA. The step variance is equal to ω + εX, where εX is the sampling variance of the descendant population (the sampling error of XA does not contribute because we are conditioning on the observed ancestral trait value). Assuming that population means are normally distributed around θ, the log-likelihood of x is obtained by substituting the appropriate mean and variance into equation (4):
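The setup above can be sketched in code. Because the mean step is θ − XA and the step variance is ω + εX, each descendant sample mean is effectively treated as normal with mean θ and variance ω + εX. Assuming a constant sampling variance, maximizing that normal likelihood gives the closed-form estimates used below. The function name `fit_stasis`, the truncation of ω at zero, and the simulated values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fit_stasis(traits, samp_var):
    """Estimate (theta, omega) under the stasis model with constant sampling variance.

    traits   : sample means in temporal order; the first serves only as an ancestor
    samp_var : sampling variance of each sample mean (assumed known and constant)
    """
    desc = np.asarray(traits, dtype=float)[1:]   # descendant samples only
    theta_hat = desc.mean()                      # estimate of the optimum
    # Observed variance around the optimum, minus the part due to sampling error.
    # Truncating at zero is an assumption made here to keep the variance non-negative.
    omega_hat = max(np.mean((desc - theta_hat) ** 2) - samp_var, 0.0)
    return theta_hat, omega_hat

# Illustrative check on simulated data (all values hypothetical).
rng = np.random.default_rng(7)
theta, omega, eps = 5.0, 0.5, 0.1
true_means = theta + rng.normal(0.0, np.sqrt(omega), size=2000)
observed = true_means + rng.normal(0.0, np.sqrt(eps), size=2000)
theta_hat, omega_hat = fit_stasis(observed, eps)
print(f"theta = {theta_hat:.3f}, omega = {omega_hat:.3f}")
```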
Figure 1.
Three example step distributions (top) used to generate corresponding evolutionary sequences of 100 steps (bottom). When the mean of the step distribution is zero, increases and decreases are equally likely and the overall dynamics are nondirectional (A, C). Step distribution B has a positive mean, and therefore will tend to produce positively trended evolutionary sequences. With increasing step variance, evolutionary sequences are more volatile, with larger positive and negative excursions (compare C with A).
Figure 2.
Mean trait in an evolving lineage (black line), sampled at five points in time (gray open circles). Close-up shows trait evolution between the last two sampled populations. The true trait difference between these two samples is equal to the sum of all evolutionary steps (si) separating the sampled populations. Because of sampling error, the estimated trait means (+) will differ from the true means by an error term (e). The observed difference between two populations includes both evolutionary differences (Σsi) and sampling error (e).
Figure 3.
The sampling distribution of μstep when estimated according to the maximum-likelihood method outlined in the text. Shown are the results for five different values of μstep, corresponding to sequences (n = 20 evolutionary transitions) that are strongly directional (−0.1, +0.1), weakly directional (−0.01, +0.01) and nondirectional (μstep = 0). Dotted lines show the mean of the 1000 replicates for each value of μstep. In all cases the mean estimated μstep is very close to the true generating value, indicating that the estimation procedure is unbiased.
Figure 4.
The sampling distribution of σ²step when estimated according to the maximum-likelihood method outlined in the text. Shown are the results for five different values of σ²step, increasing in magnitude from the top (σ²step = 0.001) to the bottom panel (σ²step = 10). Dotted lines show the mean of the 1000 replicates for each value of σ²step (each sequence consisted of n = 20 evolutionary transitions). Means of the σ²step estimates tend to be very close to, but slightly less than, the true generating value, indicating a slight bias (see text for details).
Figure 5.
Boxplots showing the sampling distribution of μstep (left) and σ²step (right) when estimated from evolutionary sequences of varying levels of completeness. In all cases, the true sequence had 10,000 steps; of these, 0.1% (10 steps), 1% (100 steps), 10% (1000 steps), and 100% (10,000 steps) were sampled at random and used to estimate μstep and σ²step. True values of both μstep and σ²step were 0.1 for all simulations (dotted lines). Boxes indicate the middle two quartiles of the estimates, with the median indicated by a vertical bar, and the total range by the horizontal lines extending from the boxes.
Figure 6.
Analysis of tooth measurements in Cantius. A, Evolutionary sequences of a size-related trait (M1 length) and a shape trait (M1 L/W ratio). Dots indicate population means, with approximate 95% confidence intervals. Sequences were standardized by within-sample variance and shifted so the first sampled point has a mean of zero, as described in the text. Time scale is in Myr counting forward from the first sample. B, C, Log-likelihood surface for estimates of the parameters (μstep and σ²step) of the general random walk model. B, M1 length. C, M1 L/W. Cross (+) indicates position of the maximum-likelihood estimate, and thin contours indicate the decrease in log-likelihood from this optimum. The thick contour outlines the 95% joint confidence region. Solutions corresponding to an unbiased random walk are indicated by the gray dotted line at μstep = 0.
Table 1.
Reanalysis of Cantius data consisting of four size-related (lengths and widths of two molars) and nine shape-related measurements (length-to-width ratios, the X and Y shape coordinates for three cusps, and the hypocone angle). Shown are the number of samples in each sequence (N), the mean number of individuals measured per sample (n), followed by the maximum-likelihood parameter estimates for the general random walk and stasis models. AICc values and Akaike weights are given for three models: GRW (general random walk), URW (unbiased random walk), and stasis. Akaike weights for models with more than minimal support (>0.05) are in bold. Trait sequences were standardized by within-sample variation prior to analysis, converting the parameter estimates to a common scale (see text).
Table 2.
The evolution of shell conicity in Contusotruncana at two sites (DSDP 384, North Atlantic Ocean; DSDP 525, South Atlantic Ocean). Each row corresponds to a model fit to the two evolutionary sequences. Models either allowed for directional evolution (GRW, general random walk), or not (URW, unbiased random walk). In addition, the models differed in terms of the homogeneity of dynamics across the two sites. In columns 3 and 4, “same” indicates that the parameter in question (μstep or σ²step) was constrained to be equal at the two sites, and “diff” means that the parameter was different, i.e., estimated separately at each site. K indicates the number of parameters in the model, and ℓ is log-likelihood. Parameter estimates are listed for the two models that provide reasonably good fits to the observed data (as indicated by bold Akaike weights). Subscripts for parameters indicate locality names.
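The model-comparison quantities named in the table captions (K, ℓ, AICc, and Akaike weights) are related by standard formulas, sketched below. The log-likelihoods and the number of observations in the example are hypothetical placeholders, not values from the tables, and taking n as the number of evolutionary transitions is an assumption.

```python
import numpy as np

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Convert AICc scores into weights that sum to one (lower score, higher weight)."""
    scores = np.asarray(scores, dtype=float)
    delta = scores - scores.min()          # difference from the best-supported model
    rel = np.exp(-0.5 * delta)             # relative likelihood of each model
    return rel / rel.sum()

# Hypothetical log-likelihoods for three candidate models fit to the same sequence.
models = {"GRW": (-20.1, 2), "URW": (-20.4, 1), "Stasis": (-25.0, 2)}
n_obs = 15                                 # hypothetical number of transitions
scores = [aicc(ll, k, n_obs) for ll, k in models.values()]
for name, w in zip(models, akaike_weights(scores)):
    print(f"{name}: Akaike weight = {w:.3f}")
```

Because the penalty grows with K, a model with an extra parameter must improve the log-likelihood enough to offset it before it receives more weight than a simpler alternative.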