Incorporating Molecular Evolution into Phylogenetic Analysis, and a New Compilation of Conserved Polymerase Chain Reaction Primers for Animal Mitochondrial DNA

Chris Simon,1,2 Thomas R. Buckley,3 Francesco Frati,4 James B. Stewart,5,6 and Andrew T. Beckenbach5

1 Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269

2 School of Biological Sciences, Victoria University of Wellington, Wellington 6014, New Zealand

3 Landcare Research, Auckland 1142, New Zealand

4 Department of Evolutionary Biology, University of Siena, 53100 Siena, Italy; email: frati@unisi.it

5 Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada; email: jbs@alumni.sfu.ca, beckenba@sfu.ca

6 Department of Laboratory Medicine, Division of Metabolic Diseases, Karolinska Institutet, Norvum 14186, Stockholm, Sweden

Annu. Rev. Ecol. Evol. Syst. 2006. 37:545–79

First published online as a Review in Advance on August 16, 2006

The Annual Review of Ecology, Evolution, and Systematics is available online.

This article’s doi: 10.1146/annurev.ecolsys.37.091305.110018

Copyright © 2006 by Annual Reviews. All rights reserved

1543-592X/06/1201-0545$20.00

Key Words

among-site rate variation, covarion-like evolution, molecular clocks, mtDNA genomes, nodal support, PCR primers

Abstract

DNA data have been widely used in animal phylogenetic studies over the past 15 years. Here we review how these studies have used advances in knowledge of molecular evolutionary processes to create more realistic models of evolution, evaluate the information content of data, test phylogenetic hypotheses, attach time to phylogenies, and understand the relative usefulness of mitochondrial and nuclear genes. We also provide a new compilation of conserved polymerase chain reaction (PCR) primers for mitochondrial genes that complements our earlier compilation.


Annu. Rev. Ecol. Evol. Syst. 2006.37:545-579. Downloaded from arjournals.annualreviews.org by Prof Chris Simon on 11/10/06. For personal use only.

mtDNA: mitochondrial DNA

Gene order rearrangement: an evolutionary change in the location and/or direction of transcription of a gene with respect to other genes

PCR: polymerase chain reaction

rRNA: ribosomal RNA

INTRODUCTION

The properties of the genes used and our ability to accommodate these properties have a much larger influence on the outcome of a molecular phylogenetic analysis than the particular method chosen to build a tree [although there are good reasons for preferring some phylogenetic methods over others (Holder & Lewis 2003; Swofford et al. 1996, 2001)]. It is from a careful study of data and their properties that empiricists can gain insight into the type of analyses needed (Simon 1991, Simon et al. 1994). Here we update our previous review of the evolution, weighting, and phylogenetic utility of mitochondrial genes and expand the focus from insects to all animals and from mitochondria to all DNA, although many of our examples still come from mitochondrial DNA (mtDNA). We summarize advances that have been made in the past 12 years, especially in the area of (a) accommodating rate variation among sites, among data partitions, and among lineages; (b) understanding the information content of data; and (c) taking advantage of the relative phylogenetic usefulness of mitochondrial and nuclear genes. We do not include a section on the phylogenetic usefulness of different mtDNA genes because this has been updated for animals by others (e.g., Lin & Danforth 2004, Meyer & Zardoya 2003). Similarly, useful reviews have appeared recently that focus on animal mitochondrial genome evolution (Boore et al. 2005), cytonuclear coevolution (Burger et al. 2003, Rand et al. 2004), mechanisms of gene order rearrangement (Boore 2000), the use of mtDNA in phylogeographic/species-level studies (Funk & Omland 2003), and the population biology of mtDNA (Ballard & Rand 2005). Finally, we include as a web resource an updated compilation of conserved mtDNA polymerase chain reaction (PCR) primers (see the Supplemental Appendix; follow the Supplemental Material link from the Annual Reviews home page) using the standardized naming system of Simon et al. (1994). This compilation contains 70 new primers that are useful for sequencing large sections of the mitochondrial genome.

The Beginnings of Molecular Systematics and the Rapid Pace of Change: The Influence of Molecular Technology

In 2003, the world celebrated the 50th anniversary of the discovery of the structure of DNA. Since 1953, DNA sequences have been incorporated into every aspect of biology. The development of molecular technology and subsequent production of data have dictated the direction of molecular phylogenetics. Despite the advances introduced by chain termination sequencing (Sanger et al. 1977), DNA sequencing was difficult and slow before the advent of PCR (Saiki et al. 1985); molecular phylogenetic analysis was therefore largely based on amino acid sequences, immunological distances, DNA-DNA hybridization, allozymes, and mitochondrial DNA restriction site mapping (reviewed in Simon 1991). Before the development of large batteries of conserved PCR primers for mitochondrial DNA (e.g., Kocher et al. 1989, Simon et al. 1994), direct sequencing of RNA was easier than sequencing of DNA, and large data sets of 18S ribosomal RNA (rRNA) accumulated and grew for comparative purposes. Thus was set into motion the collection of a large amount of sequence data for a severely problematic (thus very interesting) macromolecule simply because of sequencing technology. In the early to mid-1990s, PCR primers and the development of fast and efficient automated sequencing machines greatly increased the rate of collection of DNA data. The sequencing of complete organelle and nuclear genomes has greatly facilitated the development of additional PCR primers and the selection of genes. Despite the promise of nuclear genes (Zhang & Hewitt 2003), mtDNA still remains the most used genome in animal phylogenetics for studies of mid- to late Cenozoic-age divergences because of its faster rate of evolution, ease of sequencing, paucity of visible recombination, and conserved gene content (Caterino et al. 2000, Lin & Danforth 2004, Simon et al. 1994). A greater understanding of mitochondrial evolution (Ballard & Rand 2005, Funk & Omland 2003) allows potential pitfalls in data interpretation to be recognized and avoided.

The Importance of Phylogenetic Computer Applications

The presence of user-friendly tree-building programs has heavily influenced the choice of phylogenetic methods made by most systematists. The value of model-based methods such as maximum likelihood (ML) became apparent as more was learned about the mechanisms of evolution of DNA sequences, and empiricists and theoreticians began to consider the necessity of accommodating the peculiarities of molecular evolution in increasingly realistic models (Swofford et al. 1996; see Figure 1). In the 1980s and 1990s, tree-building programs that implemented maximum likelihood facilitated model-based analyses (e.g., Felsenstein 1981; Swofford 1998, beta version in 1993; Yang 1997, beta version 1993). In 2001, the first version of the program MrBayes became available (Huelsenbeck & Ronquist 2001). Because these programs allow a strong focus on models of evolution, building realistic models and choosing among them are two of the most active areas of research in systematics today (Posada & Buckley 2004, Sullivan & Joyce 2005). User-friendly programs like Modeltest (Posada & Crandall 1998) have facilitated the selection of models.

In the precursor to this review (Simon et al. 1994), we pointed out that likelihood and spectral-analysis methods “show great promise for phylogenetic analysis but are computationally intensive and currently work well only for a limited number of taxa.” For this reason, models of evolution were discussed in terms of distance corrections and parsimony weighting. Between 1996 and 2001, as computers and algorithms picked up speed, likelihood became the method of choice. Bayesian phylogenetic analysis (Huelsenbeck et al. 2001, Larget & Simon 1999, Yang & Rannala 1997) was rapidly embraced once MrBayes became available (Huelsenbeck & Ronquist 2001). The advantage of Bayesian analysis lies in its ability to reveal phylogenetic uncertainty in trees directly constructed using probabilistic models (Holder & Lewis 2003, Huelsenbeck & Imennov 2002). Leaché & Reeder (2002) were the first to compare parsimony, likelihood, and Bayesian phylogenetic analyses for a large mitochondrial data set and discuss the comparative advantages of these procedures.

Today, nucleotide sequence data are accumulating faster than they can be analyzed. Better and better models of evolution are being developed. Still, it is not apparent whether current Bayesian tree-building and fast maximum likelihood (e.g., Zwickl 2006) programs that incorporate complex models can handle the very large numbers of taxa to be analyzed in future data sets. One major question is whether large numbers of genes and especially enormous numbers of finely sampled taxa (sequences) can rescue distance analyses that do not make full use of the character information in the data and nonmodel-based methods such as evenly weighted parsimony that ignore complex substitution patterns. The remainder of this review explores substitution patterns and their significant effects on phylogenetic analyses and data interpretation.

Molecular clock: the assumption that the rate of molecular substitutions is constant per unit time and can be used to date divergences

HOW MOLECULES EVOLVE

Evolution and Weighting of Molecular Data

Our previous review (Simon et al. 1994) describes how model-based corrections and analogous phylogenetic weighting schemes were devised to correct for the fact that nucleotide substitutions at single sites are obscured by later substitutions that can mislead phylogenetic and molecular clock analyses. Beginning with the Jukes-Cantor (1969) model, we traced the parallel development of weighting schemes and models of evolution that relax each of Jukes-Cantor’s unrealistic assumptions: (a) all bases are found in equal proportions within a sequence, (b) every base changes to every other base with equal probability, and (c) the rate of substitution at every site is the same. To incorporate molecular realism, parsimony tree-building methods rely on (a) weighting (e.g., Cunningham 1997), or (b) conversion to distances, correction with a model of evolution, and conversion back to character data using a Hadamard transformation (Penny et al. 1996). Because many proponents of parsimony insist on even weighting of bases (which makes the same unrealistic assumptions as the Jukes-Cantor model) and because complex weighting takes away one of parsimony’s greatest advantages relative to maximum likelihood and Bayesian analyses, namely speed, little research progress has been made in data weighting. So, the discussion below focuses on models of evolution. Although some models will always fit data better than others, data might not fit any model very well (Bollback 2002). Still, Sullivan & Swofford (2001) convincingly argue that it is better to use a poorly fitting model than no model at all. Posada & Buckley (2004) and Sullivan & Joyce (2005) discuss the need for models in science and phylogenetics, review the properties of molecular evolutionary models, and/or compare and contrast methods of model selection. Below we discuss models from the point of view of the data.

Figure 1

The figure shows models of evolution from simplest (most unrealistic) at top to the most complex (most general) at bottom (modified from Swofford et al. 1996, their figure 11, and Page & Holmes 1998, their figure 5.14). α = transition rate; β = transversion rate. All models are symmetrical (probability of changing from base X to base Y is the same as changing from base Y to base X). HKY85 is similar to the Felsenstein 1984 (F84) model (formally described by Kishino & Hasegawa 1989) in that both allow for unequal base frequencies and unequal transition and transversion rates. The general time reversible (GTR) model was developed several times between 1984 and 1990 (Felsenstein 2004), but not implemented until 1993 (e.g., Swofford 1993) for computational reasons. Note that none of the models described above include an accommodation for among-site rate variation (ASRV), but this can be added by attaching a Γ (gamma) correction, an invariant sites correction, and/or by partitioning data. Standard ASRV corrections all assume that the pattern of ASRV does not change over time; violation of this assumption is addressed by covarion-like models. Another factor not accommodated by the models shown is correlation among sites. Note also that although some of these models accommodate nucleotide bias, this accommodation assumes that the bias is the same in all taxa.

Among-site rate variation (ASRV): a ubiquitous property of molecules where different numbers of substitutions per unit time occur at different DNA or amino acid sites

Among-lineage rate variation (ALRV): a property of molecules where the number of substitutions per unit time occurring at any given position varies among taxa

Covarion-like evolution: nucleotide or amino acid substitutions whose pattern of ASRV varies across lineages; encompasses heterotachy, mosaic evolution, and covarion/covariotide evolution
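The multiple-hits problem that motivates the Jukes-Cantor model can be made concrete with a short sketch. Under the three Jukes-Cantor assumptions listed in this section, the observed proportion of differing sites p saturates at 0.75, and the corrected distance is d = -(3/4) ln(1 - 4p/3). The function names below are illustrative choices, not taken from any phylogenetics package.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site) from the
    observed proportion p of sites that differ between two sequences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def jc69_expected_p(d):
    """Expected observed proportion of differing sites after d substitutions
    per site; later substitutions hide earlier ones, so this is below d."""
    return 0.75 * (1.0 - math.exp(-(4.0 / 3.0) * d))


if __name__ == "__main__":
    for p in (0.01, 0.10, 0.30, 0.50):
        print(f"observed p = {p:.2f} -> corrected d = {jc69_distance(p):.4f}")
```

The gap between p and d widens as sequences diverge, which is why uncorrected distances become increasingly misleading for older divergences.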

Nucleotide and Amino Acid Substitution Models

The models of evolution described in Figure 1 show the progressive incorporation of realism from top to bottom, including accommodation of biased base composition within a sequence and biased substitution patterns from one nucleotide to another. As stated in the legend, none of these models incorporate among-site rate variation (ASRV) per se, but ASRV can be added as one or more additional parameters. Accommodating ASRV has more of an effect on phylogenetic analyses than accommodating nucleotide and base substitutional biases within sequences (Sullivan & Swofford 2001, Yang et al. 1994). Adding ASRV to amino acid evolution models has also resulted in substantial improvement in model fit (e.g., Susko et al. 2003, contra Yang et al. 1994). Among-lineage rate variation (ALRV; covarion-like evolution) is a more complex process to model and, as a result, models of evolution that address these processes are less well developed. Below we discuss ASRV and ALRV and their effects on phylogenetic analysis.
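The nesting of simpler models inside more general ones can be illustrated with a minimal sketch of an HKY85-style rate matrix. It assumes a single transition/transversion rate ratio, here called kappa (corresponding to α/β in the notation of Figure 1), and a dictionary of base frequencies; both names are illustrative. Setting kappa to 1 with equal base frequencies recovers the Jukes-Cantor model.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (x in PURINES) == (y in PURINES)


def hky85_rate_matrix(freqs, kappa):
    """Build a 4x4 instantaneous rate matrix (rows and columns in ACGT order).

    freqs: dict mapping each base to its equilibrium frequency (sums to 1).
    kappa: transition/transversion rate ratio (an assumed parameterization).
    The matrix is rescaled so that the mean substitution rate equals 1.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                q[i][j] = (kappa if is_transition(x, y) else 1.0) * freqs[y]
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    mean_rate = -sum(freqs[x] * q[i][i] for i, x in enumerate(BASES))
    return [[value / mean_rate for value in row] for row in q]


if __name__ == "__main__":
    equal = {b: 0.25 for b in BASES}
    for row in hky85_rate_matrix(equal, kappa=1.0):  # reduces to Jukes-Cantor
        print(["%.3f" % v for v in row])
```

Because each off-diagonal rate is proportional to the frequency of the target base, the matrix is time reversible, which is the symmetry property noted in the Figure 1 legend.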

Among-Site Rate Variation

History. Ideally, slowly evolving genes would be most useful for deep-level phylogenetics whereas rapidly evolving genes would be necessary for reconstructing recent divergences. Unfortunately, most genes are not well characterized by a single average rate of evolution. With the unraveling of the genetic code between 1961 and 1966, it became immediately obvious that substitutions at different codon and amino acid positions would be accepted at different rates owing to the degeneracy of first and third positions. Shortly thereafter, evolutionary biologists began to examine how this ASRV might affect genetic distances among taxa and phylogenetic analyses based on them (e.g., Fitch & Margoliash 1967). Simon et al. (1994) reviewed early studies that attempted to incorporate ASRV into models of evolution and weighting schemes. They also demonstrated how knowledge of molecular structure and function can help to understand the constraints that create rate variation among sites. Yang (1996) produced an exceptionally complete review of the discovery of ASRV and its incorporation as both discrete rate classes and continuous distributions into usable models of DNA sequence evolution, including methods for calculating the α-shape parameter of the Γ-distribution of rates across sites. It is now well established that even genes that are considered to be strongly conserved contain rapidly evolving sites (e.g., Simon et al. 1996), with the converse also being true. Therefore it is inadequate to characterize a gene as fast or slow and expect that categorization to hold across all sites.
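How a continuous Γ-distribution of rates is turned into discrete rate classes can be sketched as follows. The sketch assumes a gamma distribution with shape α and mean 1, splits it into equal-probability categories, and represents each category by its mean rate. Phylogenetic programs compute these category means analytically; here they are approximated by sorting random draws, purely to keep the example dependency-free, and the function name and defaults are illustrative.

```python
import random


def discrete_gamma_rates(alpha, n_categories=4, n_draws=100_000, seed=1):
    """Approximate the mean rate of each equal-probability category of a
    gamma distribution with shape alpha and mean 1 (Monte Carlo sketch)."""
    rng = random.Random(seed)
    # shape = alpha and scale = 1/alpha give a distribution with mean 1
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_draws))
    size = n_draws // n_categories
    rates = [sum(draws[k * size:(k + 1) * size]) / size for k in range(n_categories)]
    mean = sum(rates) / n_categories
    return [r / mean for r in rates]  # renormalize so the average rate is exactly 1


if __name__ == "__main__":
    for alpha in (0.2, 0.5, 2.0, 50.0):
        rates = discrete_gamma_rates(alpha)
        print(alpha, ["%.3f" % r for r in rates])
```

Small values of α give a few fast categories and many nearly invariant ones, whereas large values give categories with nearly equal rates, approaching the equal-rates assumption.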


SSR: site-specific rates

Topology: the branching order of a phylogenetic tree

Detrimental effects of ignoring ASRV. In his 1996 review, Yang summarized the detrimental effects of ignoring ASRV: severe underestimation of genetic distances, incorrect estimation of transition-transversion rate ratios, and confounding of phylogenetic tree-building algorithms. On the population level, Yang (1996) pointed out that the infinite alleles model ignores ASRV, which in turn invalidates Tajima’s D statistic for testing neutrality and causes the distribution of pairwise sequence differences to mimic patterns of population expansion. Revell et al. (2005) found that incorrect inferences of rapid cladogenesis early in a group’s history due to bias in tree shape were caused by model underparameterization, especially the omission or misestimation of ASRV. Buckley et al. (2001a) and Buckley & Cunningham (2002) showed that ignoring ASRV has a strong effect on estimates of nonparametric bootstrap support. Similarly, Lemmon & Moriarty (2004) found that when ASRV is ignored in simulations, model misspecification has a strong effect on Bayesian posterior probability estimates of nodal support. Although Kumar et al. (1993) warned that correcting sequences that differ by less than 5% may result in overparameterization (trading an increased error variance for realism), Yang (1996) argued that this point of view is a misconception. Sullivan & Joyce (2005) present an extensive discussion of the impact of model misspecification and overparameterization.
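The underestimation of genetic distances can be illustrated numerically. The sketch below compares the equal-rates Jukes-Cantor distance with its gamma-corrected counterpart, d = (3/4) α [(1 - 4p/3)^(-1/α) - 1], a standard textbook formula that is not given in the text; the value α = 0.5 is an arbitrary choice for illustration.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance assuming every site evolves at the same rate."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_gamma_distance(p, alpha):
    """Jukes-Cantor distance when rates across sites follow a gamma
    distribution with shape alpha and mean 1."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


if __name__ == "__main__":
    alpha = 0.5  # assumed shape value indicating strong rate heterogeneity
    for p in (0.05, 0.15, 0.30, 0.45):
        equal = jc69_distance(p)
        gamma = jc69_gamma_distance(p, alpha)
        print(f"p = {p:.2f}  equal rates d = {equal:.3f}  gamma d = {gamma:.3f}")
```

For the same observed proportion of differences, the gamma-corrected distance is always the larger of the two, and the discrepancy grows with divergence; as α becomes large the two estimates converge.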

Among-site rate variation in amino acid sequences. Like nucleotides, amino acids show considerable variability in rate of substitution among sites. Although the earliest studies of ASRV focused on amino acids (e.g., Fitch & Markowitz 1970), later phylogenetic studies of amino acids ignored ASRV. Recently, there has been a rebirth of interest in ASRV and the beneficial effects of its incorporation into models of amino acid evolution (e.g., Susko et al. 2003, 2004). As noted above, ASRV is caused by structural and functional constraints. Mitochondria-specific models of evolution that use information on the probability of particular amino acid replacements (e.g., Adachi & Hasegawa 1996) and secondary structure (e.g., Liò & Goldman 2002) also improve phylogeny construction, especially for deeper-level phylogenetic studies where adaptive shifts in molecules among lineages (see below) are more likely to have taken place.

Different methods of accommodating among-site rate variation vary in their effects. ASRV can be addressed by partitioning the data into different rate classes and assigning each rate class its own rate (site-specific rates or SSR models). However, SSR models (where each site in a rate class is assumed to evolve at the same rate) give much lower branch-length estimates than Γ models, invariant sites models, and partitioned-Γ models (Figure 2). Although ignoring ASRV may not in all circumstances affect topology, it is more likely to have an effect on nodal support (Buckley et al. 2001a). Buckley & Cunningham (2002) evaluated the effect of different ASRV models using six real data sets for taxa whose relationships are supported strongly by other data. By examining the match of trees constructed from a variety of models to the well-supported trees, they found that SSR models and evenly weighted parsimony performed poorly in recovering the topology and produced lower branch supports than models that incorporated a gamma (Γ) or invariant sites (I) correction.


[Figure 2 plot. x-axis: GTR branch lengths; y-axis: GTR variant branch lengths; axis ticks run from 0.00 to 0.18.]

Figure 2

Maximum-likelihood estimates of branch lengths under a range of variants of the general time reversible (GTR) substitution model (colored symbols). The diagonal line connecting the GTR points on the x- and y-axes illustrates the deviation of the ASRV-corrected branch lengths from the equal-rates estimates. Models of evolution are described in Figure 1. Subscripts indicate number of partitions; e.g., in the GTR + SSR4 model the data are partitioned into four SSR classes (first, second, and third positions of proteins plus transfer RNA sites). Redrawn with permission from Buckley et al. 2001a.

APRV: among-partition rate variation

Mixture models: models in which data points are viewed as generated by one of a number of distributions, each contributing to the likelihood

Among-Partition Rate Variation

Although partitioning data does not work well via SSR models because of the usually unrealistic assumption that all sites within the partition evolve at the same rate, partitioning strategies that estimate the distribution of ASRV separately for each partition (SSR + Γ) avoid this problem. Yang’s PAML program has allowed partitioned models for many years but worked slowly for large numbers of taxa. Recently MrBayes has introduced partitioned models for Bayesian Markov chain Monte Carlo (MCMC) analyses that work well for many taxa. Using partitioned models on combined data with heterogeneous rates addresses many of the concerns about data combinability (e.g., Barker & Lutzoni 2002, Buckley et al. 2002, Bull et al. 1993), assuming that the same topology (history) underlies different data partitions. Independently modeling data partitions improves branch support and tree likelihood (Brandley et al. 2005, Castoe et al. 2004, Nylander et al. 2004), but must be done with care because partitioning data without properly accommodating among-partition rate variation (APRV) and/or using branch-length priors that are too diffuse can seriously distort branch lengths (Marshall et al. 2006).

Mixture Models

There are many ways any one data set can be partitioned, and the method by which this is done is usually arbitrary. Indeed, for many genes it is not always clear how best to partition data. For protein coding genes, first, second, and third positions are common partitions; however, within each of these categories there is considerable variation in functional constraint. At third positions, sites can be twofold, threefold, or fourfold degenerate, with the fourfold degenerate sites displaying the most freedom to vary (although these sites may be somewhat constrained by codon usage bias). In addition, messenger RNA secondary structure, protein secondary/tertiary structure, and other functional considerations all influence the rate and pattern of evolution. In rRNA, partitioning stems (paired) versus loops (unpaired) regions is tempting, but there is heterogeneity within each of these partitions. Although some authors have partitioned rRNA data into stems versus loops (e.g., Springer & Douzery 1996), this does not make sense in terms of partition variability because some stems are slowly evolving whereas others evolve rapidly (Figure 3). Furthermore, within a stem the pattern of variation may differ depending on the identity of the base; tertiary structure and protein interactions also add complexity (Gutell et al. 2002, Hickson et al. 1996, Pagel & Meade 2004).

A more informative method of accommodating rate and pattern variation among data subsets is to use mixture models that do not require a priori specification of partitions. In mixture models, characters are viewed as having been generated by one of a number of distributions, and each of these distributions contributes to the likelihood during an analysis. The parameters of each model distribution and the weights assigned to them are estimated from the data. At the end of the analysis, each character can be assigned a probability of membership in each model. Pagel & Meade (2004) developed a Bayesian mixture model and characterized pattern variation in published protein (EF1α and DDC) and 12S small subunit (SSU) mitochondrial rRNA data. For the rRNA data, their program converged on four rate matrices to describe the patterns of variation in the data, but these matrices only partially corresponded to stems and loops. In fact, the sites weighted heaviest for one of the matrices were evenly divided between stems and loops. Similarly, for the protein coding genes, although each Q matrix specialized on a particular codon position, each matrix also provided the best fit to some other codon positions. Lartillot & Philippe (2004) developed a mixture model for amino acid sequence evolution. Their use of a Bayesian Dirichlet process prior allowed the association of each amino acid site to a given Q matrix to be determined during the analysis. Using Bayes factors they demonstrated that the mixture model outperformed standard amino acid rate matrices.

Mixture models do not solve all problems for ribosomal RNA. In rRNA there is clearly covariation among sites that is related to base-pairing in helices, long-distance base-pairing interactions across domains, and ribosomal-protein-rRNA interactions (e.g., Hickson et al. 1996). In fact, covariation analysis has been particularly important in devising elegant models of rRNA secondary structure that have now been completely verified using experimental methods and X-ray crystallography (Gutell et al. 2002). Accommodating correlation among sites in models of rRNA evolution is important for phylogenetic analysis (e.g., Huelsenbeck & Nielsen 1999, Smith et al. 2004) and needs further study. Finally, rRNA molecules often include large numbers of variable length indels that can cause alignment difficulties and provide information that is difficult to objectively incorporate into phylogenetic reconstruction. In general, information in indels has been quantified separately and added into phylogenetic analyses later (Kjer et al. 2001, Lutzoni et al. 2000). A growing number of models explicitly describe covariation between bases in RNA molecules (e.g., Smith et al. 2004), but these are only rarely employed in phylogenetic reconstructions (e.g., Kjer 2004).

rRNA secondary structure: the pattern of helices and unpaired regions formed when a single-stranded ribosomal RNA primary sequence folds and pairs with itself

Covariation: two bases or two amino acids whose rates of evolution are correlated, usually due to functional or structural constraints
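The way a mixture model combines several distributions, and then assigns each character a probability of membership in each, can be sketched in a few lines. The sketch assumes that the likelihood of every site under every component (for example, under each Q matrix on a fixed tree) has already been computed, and that the component weights are known; in a real analysis both would be estimated from the data, and the numbers in the usage example are invented for illustration.

```python
import math


def mixture_site_summary(site_likelihoods, weights):
    """Combine per-component site likelihoods under a mixture model.

    site_likelihoods[i][j]: likelihood of site i under component j
    weights[j]: mixture weight of component j (weights sum to 1)
    Returns the total log-likelihood and, for each site, the probability
    that the site belongs to each component.
    """
    total_log_likelihood = 0.0
    memberships = []
    for per_component in site_likelihoods:
        weighted = [w * lik for w, lik in zip(weights, per_component)]
        site_likelihood = sum(weighted)  # every component contributes
        total_log_likelihood += math.log(site_likelihood)
        memberships.append([x / site_likelihood for x in weighted])
    return total_log_likelihood, memberships


if __name__ == "__main__":
    # Invented likelihoods for three sites under two hypothetical rate matrices
    likelihoods = [[0.020, 0.001], [0.003, 0.030], [0.010, 0.010]]
    log_lik, member = mixture_site_summary(likelihoods, weights=[0.5, 0.5])
    print("log-likelihood:", round(log_lik, 4))
    for row in member:
        print(["%.3f" % m for m in row])
```

No site is forced into a single partition: a site that fits two components equally well simply receives intermediate membership probabilities, which is how such matrices can end up corresponding only partially to stems and loops or to codon positions.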

Among-Lineage Rate Variation

Covarions, covariotides, heterotachy. Fitch & Markowitz (1970) proposed that adaptive shifts in protein function over time would result in a change in the probability of substitution of a particular amino acid site over time (across lineages). Based on earlier ideas by Margoliash & Smith (1965) and Fitch & Margoliash (1967), they proposed the covarion hypothesis, which speculated that a certain proportion of sites in a protein were not free to vary but could become variable if other sites assumed their function. This was later extended to nucleotides and called covariotide evolution (Fitch 1986). Miyamoto & Fitch (1995) reviewed the development of the covarion hypothesis and showed that the distribution of variable and invariant positions is different in seven mammal and seven plant sorbitol dehydrogenase amino acid sequences and thus follows a covarion model. Miyamoto & Fitch (1995) and others (e.g., Steel et al. 2000) pointed out that a constant-sized covarion class is unrealistic because the number of variable positions can differ among lineages. Newer covarion models (Galtier 2001, Tuffley & Steel 1998) do not require this restriction.
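One common way to formalize the covarion idea is to let each site switch between a variable ("on") state and an invariable ("off") state along the tree. The sketch below builds such an expanded rate matrix in the spirit of switching models like that of Tuffley & Steel (1998); it is an illustration under assumed parameter names (s_on, s_off) and a Jukes-Cantor substitution process, not a reproduction of any published implementation.

```python
def jc69_rate_matrix():
    """4x4 Jukes-Cantor rate matrix with a mean rate of 1."""
    return [[-1.0 if i == j else 1.0 / 3.0 for j in range(4)] for i in range(4)]


def covarion_rate_matrix(q, s_on, s_off):
    """Expand an n-state rate matrix q into a 2n-state covarion-style matrix.

    States 0..n-1 are "on" (free to vary); states n..2n-1 are "off".
    s_on: rate of switching off -> on; s_off: rate of switching on -> off.
    Substitutions happen only while a site is on, and switching never
    changes the nucleotide itself.
    """
    n = len(q)
    big = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                big[i][j] = q[i][j]      # ordinary substitution while on
        big[i][n + i] = s_off            # on -> off, same nucleotide
        big[n + i][i] = s_on             # off -> on, same nucleotide
    for r in range(2 * n):
        big[r][r] = -sum(big[r])         # rows of a rate matrix sum to zero
    return big


if __name__ == "__main__":
    for row in covarion_rate_matrix(jc69_rate_matrix(), s_on=0.2, s_off=0.1):
        print(["%.2f" % v for v in row])
```

Because each site switches independently along each lineage, the set of variable sites is free to differ among lineages, which is the property that a constant-sized covarion class lacks.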

Because nucleotide functional/rate shifts occur in rRNA as well as in protein coding genes (Simon et al. 1994, 1996), a more general name for this phenomenon would be helpful. Philippe & Lopez (2001) coined the useful term heterotachy to describe positions that evolve at different rates in different lineages. Earlier this type of evolution had been called “independent and episodic” (Johannes & Berger 1993), “mosaic evolution” (Simon et al. 1996), or “covarion-like evolution” (Lockhart et al. 1998, Lopez et al. 1999). Demonstrations that the evolutionary rate of a given position is not always constant throughout time, apart from discussions of codon evolution, include those by Johannes & Berger (1993), Philippe et al. (1996), Simon et al. (1996), Lockhart et al. (1996, 1998), Lopez et al. (1999), and Gaucher et al. (2001).

Functional shifts in molecules cause shifts in the pattern of ASRV and are expected over the course of long-term evolution.Lopez et al.(2002)studied more than 2000vertebrate cytochrome-b sequences from 32large monophyletic groups and found ←??????????????????????????????????????????????????????????????

Figure 3

Rates of variability of individual nucleotide positions contingent on nucleotide variabilities were based on 500sequences of species belonging to the eukaryotic crown taxa small subunit rRNA molecules,superimposed on the secondary structure of Saccharomyces cerevisiae .The most variable positions are in black,the most conserved in light blue,and invariable positions are in white.Sites containing a nucleotide in S.cerevisiae but vacant in more than 75%of sequences,which were not considered for the variability calculations,are indicated in orange.Areas that could not be aligned with con?dence are also indicated in orange.All rRNA

molecules have similar patterns of variability.Redrawn with permission from Yves Van de Peer (http://www.psb.ugent.be/rRNA/varmaps/Scer ssu.html).

https://www.sodocs.net/doc/1014878599.html, ?Phylogenetics from the Perspective of the Data 555

A n n u . R e v . E c o l . E v o l . S y s t . 2006.37:545-579. D o w n l o a d e d f r o m a r j o u r n a l s .a n n u a l r e v i e w s .o r g b y P r o f C h r i s S i m o n o n 11/10/06. F o r p e r s o n a l u s e o n l y .

Nonstationarity: any process characterized by statistical properties that vary over time; in phylogenetics most often discussed with reference to nucleotide bias

that all variable positions are heterotachous but that, surprisingly, there was no obvious relationship between variability, function, and three-dimensional structure. Gribaldo et al. (2003) found that not all sites in vertebrate hemoglobin show a relationship to structure and function; there is a class of sites that are constant (within groups) but different (among groups), and these CBD sites show “signatures of functional specialization.” Lockhart et al. (2000) showed that evolving distributions of variable sites alone provide support for deep-branching patterns in eubacterial phylogenies. If these shifts exhibit convergent evolution, Lockhart and colleagues pointed out, the tree or parts of it could be artifactual (bad covarion evolution). Reassuringly, different genes recovered similar patterns of evolution (good covarion evolution). Misof et al. (2002) showed how the Γ-distributed rate at particular sites in the insect mitochondrial 16S rRNA gene varied extensively among different insect orders, indicating that covariotide evolution operates in mtDNA at deep levels of divergence.

Progress has been made in developing covarion-like models of sequence evolution. In addition to the Tuffley & Steel (1998) and Galtier (2001) models cited above, Yang et al. (2000) created a model that allows selection constraints to change across the sequence. One drawback of this model is that the branches with different selection pressures must be specified a priori. Guindon et al. (2004) created a model that not only allows selection to change across the sequence, but also allows selective constraints to change over time without a priori specification. Recently, Huelsenbeck et al. (2006) applied a Dirichlet process prior in a fully Bayesian approach to model variation in nonsynonymous sites and allow selection to vary across a sequence. These models are close in spirit to the original covarion models in which an amino acid substitution at one position in a gene changes the selective constraints elsewhere.

Nucleotide bias among lineages. Shifts in patterns of ASRV can cause changes in the nucleotide bias of taxa across the tree. This is because the nucleotide bias of an organism’s genome is most evident at the most variable sites (Simon et al. 1994). Earlier it had been shown that substitutional bias can seriously affect phylogenetic trees (e.g., Lockhart et al. 1992, Weisburg et al. 1989). If patterns of nucleotide bias have changed over lineages, models that assume a stationary distribution of nucleotide bias among taxa can cause systematic error in phylogeny construction. LogDet-type models were designed to incorporate nucleotide bias nonstationarity (Steel 1994). Because ASRV and ALRV (covarion-like evolution) can occur simultaneously, the LogDet model should be combined with an invariant sites model, because the LogDet does not correct for ASRV. Haddrath & Baker (2001) showed that ratite bird phylogenies based on complete mitochondrial genomes were consistent with traditional expectations only when they were corrected for variation in nucleotide bias among lineages. Similarly, using complete mitochondrial genomes corrected for nonstationarity, Paton et al. (2002) refuted the controversial conclusions based on fossils/morphology that modern birds are descended from shorebirds or passerines. Jermiin et al. (2004) review the literature on nucleotide nonstationarity, discuss methods to detect it and, using simulations, examine the effects of nucleotide nonstationarity on tree building.
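To make the LogDet idea concrete, the following is a minimal sketch of one common normalized form of the transformation (often called the paralinear distance) for a single pair of aligned sequences: d = -(1/4)[ln det F - (1/2) ln(det Dx det Dy)], where F is the 4 x 4 matrix of joint base frequencies and Dx, Dy are diagonal matrices of each sequence's base frequencies. The function names are illustrative, gaps and ambiguity codes are simply skipped, and no invariant-sites correction is applied, so this sketch inherits the limitation with respect to ASRV noted above.

```python
import math
from collections import Counter

BASES = "ACGT"


def determinant(matrix):
    """Determinant of a square matrix by Gaussian elimination with pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """Paralinear (LogDet-type) distance between two aligned DNA sequences."""
    pairs = [(a, b) for a, b in zip(seq_x.upper(), seq_y.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    counts = Counter(pairs)
    # F[i][j]: proportion of sites with base i in x and base j in y.
    f = [[counts[(a, b)] / n for b in BASES] for a in BASES]
    freq_x = [sum(row) for row in f]
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]
    det_f = determinant(f)
    if det_f <= 0 or min(freq_x) == 0 or min(freq_y) == 0:
        raise ValueError("distance undefined for this pair")
    log_marginals = (sum(math.log(p) for p in freq_x)
                     + sum(math.log(p) for p in freq_y))
    return -0.25 * (math.log(det_f) - 0.5 * log_marginals)
```

Because the marginal base frequencies of each sequence enter the normalization separately, the distance does not assume that the two taxa share the same composition, which is the property that motivates LogDet-type methods when nucleotide bias is nonstationary.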

Although the biasing effects of nucleotide nonstationarity are well known, they are often forgotten. For example, in a recent paper, Rokas et al. (2003) used data

556Simon et al.

from eight complete yeast nuclear genomes to discover the minimum number of genes necessary to build a robust and well-supported phylogeny. They concluded that approximately 20 genes (average length 1198 nucleotides) were needed and that there were no discernible characteristics of these genes that “predicted the performance” of one over the other. There are several problems with these conclusions. First (trivial but often overlooked), it is not the number of genes that is important but the number of informative sites, and this differs among genes and at different depths in the tree. Second and more important, there are many characteristics of genes that can make them less useful for phylogenetic analysis. One is high levels of ASRV; another is high levels of ALRV. Collins et al. (2005) demonstrated that many of the genes that Rokas et al. (2003) employed contained nonstationary nucleotide frequencies. When these genes were excluded, it was concluded that, on average, eight yeast genes were required to recover the underlying phylogeny. Although, as Collins and colleagues point out, genes do not come in two classes (stationary and nonstationary), the greater the variation in levels of base compositional bias among lineages, the more problematic the gene is likely to be for phylogenetic reconstruction. A recent paper by Hedtke et al. (2006) demonstrated that the Rokas et al. (2003) phylogeny also suffered from poor taxon sampling leading to long branch problems.
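A rough first screen for variation in base composition among lineages can be sketched as a chi-square test of homogeneity on a taxon-by-base table of counts. This is only an illustration with hypothetical function names, not the procedure used in the studies cited above. It treats sites as independent and ignores shared ancestry, so a large statistic should prompt a closer look rather than serve as a formal test.

```python
from collections import Counter

BASES = "ACGT"


def composition_chi_square(sequences):
    """Chi-square statistic for homogeneity of base composition across taxa.

    sequences: dict mapping taxon name -> DNA sequence.
    Returns (statistic, degrees_of_freedom).
    """
    counts = {name: Counter(b for b in seq.upper() if b in BASES)
              for name, seq in sequences.items()}
    base_totals = {b: sum(c[b] for c in counts.values()) for b in BASES}
    grand_total = sum(base_totals.values())
    statistic = 0.0
    for c in counts.values():
        taxon_total = sum(c[b] for b in BASES)
        for b in BASES:
            # Expected count if every taxon shared the pooled composition.
            expected = taxon_total * base_totals[b] / grand_total
            if expected > 0:
                statistic += (c[b] - expected) ** 2 / expected
    dof = (len(sequences) - 1) * (len(BASES) - 1)
    return statistic, dof
```

The statistic is zero when every taxon has the pooled composition and grows as lineages diverge in base frequencies, which matches the point that stationarity is a matter of degree rather than a two-class distinction.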

Codon models. Codon models, reviewed by Yang (2003), operate at the level of the codon as a unit. They are an advancement over amino acid substitution models that only consider the probability of changing from one particular amino acid to another. Codon models take into account the fact that a switch from one type of codon family (e.g., twofold versus fourfold versus sixfold degeneracy) changes the probability of substitution of individual bases within the codon. Although earlier codon models assumed that the rate of synonymous substitution is constant among sites within genes (e.g., Muse & Gaut 1994), newer codon models reflect the fact that there is significant variability of synonymous rates in the majority of genes. This is observed, for example, in complete mitochondrial genome sequences of 111 animal taxa sampled from 10 disparate clades (F.V. Mannino & S.V. Muse, in review). Because of their computational complexity, codon models are only rarely used for inferring tree topology (Ren & Yang 2005).
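The point about codon families can be illustrated with a short sketch that reports how many third-position bases leave the amino acid unchanged. For simplicity it assumes the standard nuclear genetic code; animal mitochondrial codes differ at several codons, so the table would need to be swapped for real mtDNA data. The function names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# Assumption: mitochondrial codes reassign some codons and are not shown here.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)}
FAMILY_SIZES = Counter(CODON_TABLE.values())


def third_position_degeneracy(codon):
    """Number of third-position bases that encode the same amino acid."""
    aa = CODON_TABLE[codon]
    return sum(CODON_TABLE[codon[:2] + b] == aa for b in BASES)


def family_size(amino_acid):
    """Total number of codons that encode the given amino acid."""
    return FAMILY_SIZES[amino_acid]
```

For example, TTA and CTA both encode leucine (a sixfold family), yet the first-position change moves the codon from a twofold to a fourfold third position, so the chance that a later third-position change is synonymous shifts even though the amino acid has not changed.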

INTERPRETING TREES AND DATA SUPPORT

A thorough knowledge of the properties of molecular data and how they evolve can also aid in the design and interpretation of phylogenetic studies. Taxon sampling, measuring nodal support, testing alternative phylogenetic hypotheses, and attaching time to phylogenies are four areas where understanding molecular evolution can improve the process.

Taxon Sampling

Theoretical and simulation studies have shown that trees can be very hard to reconstruct when branch lengths are unequal and rates of change vary over the tree, although this is partly dependent on the method of phylogenetic inference employed (e.g., Felsenstein 1978, Swofford et al. 2001). Adding taxa to a tree in order to break up long branches improves the accuracy of topology estimation by better reconstructing the history of character-state changes, an implicit step in parsimony and likelihood inference. The more nodes in a phylogenetic tree, the more information on character-state change. The result is that more densely sampled subsections of a tree tend to have longer total branch lengths (Venditti et al. 2006). Another factor that can reduce sampling bias is increased sequence length, which also leads to more accurate estimates of branch lengths and tree topology as long as the added nucleotides have properties similar to the original sample. The above observations have led to a debate on whether it is preferable to sample more characters or more taxa in order to increase phylogenetic accuracy (e.g., Pollock et al. 2002, Rosenberg & Kumar 2001, Zwickl & Hillis 2002). The extensive simulation studies of Zwickl & Hillis (2002) and Pollock et al. (2002) showed how adding taxa to a phylogenetic analysis was more effective than adding more characters once a certain sequence length was reached (Hillis et al. 2003). Maximum-likelihood simulations showed consistent improvement with the addition of taxa, probably owing to the improved estimation of model parameters (Pollock & Bruno 2000). However, taxon addition should be done strategically to avoid adding taxa that increase the depth of the tree or that create long branches, which can introduce further biases into the reconstruction (e.g., Mitchell et al. 2000, Poe 2003, Hedtke et al. 2006).

The second debate surrounding the density of taxon sampling concerns the efficacy of evenly weighted parsimony in reconstructing large trees with many taxa. Hillis (1996) suggested that densely sampled phylogenies were actually easier to reconstruct accurately than were sparsely sampled phylogenies, even for simple models of evolution such as evenly weighted parsimony. He showed that highly variable rates of evolution and ASRV were much less of a problem for densely sampled trees than for trees of only a few taxa. Dense sampling of taxa decreases the number of superimposed changes of characters that must be reconstructed along lineages and therefore decreases the reliance on accurate and complex models of evolutionary change. DeBry (2005) summarized the ensuing debate and conducted new simulations, which agreed that overall accuracy for parsimony increases with increased taxon sampling. Similarly, Salamin et al. (2005) showed that evenly weighted parsimony performed quite well in reconstructing large phylogenies. However, although it is inevitable that increased taxon sampling will help to reconstruct superimposed changes, the conclusions about the efficacy of evenly weighted parsimony may be less applicable to empirical sequence data, which tend to fit models less well than simulated data and have more ASRV and more nucleotide bias (e.g., Holder 2001). Empirical data may also show variation in patterns of substitution among lineages as described above.

Measures of Nodal Support

Measures of nodal support are generally more satisfying than whole tree measures of information content because for most phylogenetic trees some clades are better supported than others. Measures of nodal support provide a useful summary of how well data support the relationships defined by a tree. Poorly supported relationships are of little use in evolutionary studies other than to illustrate where more data are needed before conclusions can be drawn. Of course, high support values do not mean that a node is accurate, only that it is well supported by the data; model misspecification and taxon sampling can mislead the analysis (e.g., Hedtke et al. 2006).

Currently, the nonparametric bootstrap (Felsenstein 1985) is still the most widely used method for assessing nodal support, despite a long-running debate as to the validity and interpretation of the bootstrap in phylogenetics (e.g., Efron et al. 1996, Felsenstein & Kishino 1993, Hillis & Bull 1993, Holmes 2005, Sanderson 1995). Perhaps the best interpretation is that the bootstrap quantifies the sensitivity of a node to perturbations in the data (Holmes 2005). However, as commonly implemented, the bootstrap gives a biased estimate of accuracy (Hillis & Bull 1993, Holmes 2005), where accuracy is defined as the probability of obtaining a correct phylogenetic reconstruction (Penny et al. 1992). The reason for this bias is related to the complex geometry of tree space and site-pattern space (Efron et al. 1996, Holmes 2005), which is described as follows. All possible site patterns (i.e., sets of sites that show identical states across all taxa) for a given data set can be divided into regions, each one separated by a boundary. Each region has an optimal topology associated with it. An observed data set lies within the sampling distribution of what we can consider to be the truth. Because bootstrap replicates are generated from the observed data rather than the truth, the proportion of replicate data sets that lie within each region can become distorted, which in turn can bias the bootstrap (Sanderson & Wojciechowski 2000). More sophisticated bootstrap techniques are available to correct for this bias (e.g., Efron et al. 1996, Shimodaira 2002); unfortunately, these are rarely implemented to measure nodal support.
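The resampling step of the nonparametric bootstrap is simple enough to sketch directly. The code below draws alignment columns with replacement and reports the proportion of replicates in which a clade is recovered. Tree inference is delegated to a caller-supplied function; the toy criterion included here (a four-taxon parsimony count for taxa named A, B, C, and D) is a hypothetical stand-in for a real tree search, and all names are illustrative.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in columns)
            for name in names}


def ab_clade_is_most_parsimonious(alignment):
    """Toy four-taxon criterion: does the split AB|CD have the most support?"""
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in zip(alignment["A"], alignment["B"],
                          alignment["C"], alignment["D"]):
        if a == b and c == d and a != c:
            support["AB|CD"] += 1
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return support["AB|CD"] > max(support["AC|BD"], support["AD|BC"])


def bootstrap_support(alignment, clade_supported, n_replicates=200, seed=1):
    """Proportion of bootstrap replicates in which the clade is recovered."""
    rng = random.Random(seed)
    hits = sum(bool(clade_supported(bootstrap_alignment(alignment, rng)))
               for _ in range(n_replicates))
    return hits / n_replicates
```

Every replicate is drawn from the observed alignment rather than from the unknown generating process, which is the source of the bias described above.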

The well-known bias of the bootstrap has led researchers to seek other methods of estimating nodal support, and perhaps the most popular alternative is Bayesian posterior probability (e.g., Larget & Simon 1999, Yang & Rannala 1997). The increasing reliance on posterior probabilities as measures of nodal support, as opposed to the bootstrap, has initiated a debate as to the merits of the two approaches (e.g., Huelsenbeck & Rannala 2004, Suzuki et al. 2002). This debate arose from early observations of Bayesian inference in phylogenetics that demonstrated a tendency for posterior probabilities to be more extreme than ML nonparametric bootstrap proportions, although the two tended to be correlated. This observation was made from both empirical (e.g., Buckley et al. 2002, Wilcox et al. 2002) and simulated data (e.g., Cummings et al. 2003, Suzuki et al. 2002, Wilcox et al. 2002). Here we address the following questions: Why are bootstrap proportions and posterior probabilities different? Is this really a problem? If so, what can be done about it?

Comparing posterior probabilities and bootstrap proportions is difficult because they represent fundamentally different quantities. A nodal posterior probability is the probability that a given node is found in the true tree, conditional on the observed data and the model (including both the prior model and the likelihood model). Some researchers argue that posterior probabilities are superior to bootstrap proportions because the former give a more direct measure of confidence in a node (e.g., Huelsenbeck & Rannala 2004). A bootstrap proportion is based on the concept of resampling, and its exact interpretation depends on how it was calculated, as discussed above. Furthermore, posterior probabilities are calculated by assuming a prior distribution on all model parameters, including the branch lengths and topology, and these priors will influence the posterior in many cases (e.g., Yang & Rannala 2005). This dependence on prior distributions also complicates the comparison of the two measures of support. Yang & Rannala (2005) demonstrated how some prior distributions on branch length can cause nodal posterior probabilities to become extreme. Lewis et al. (2005) also showed how posterior probabilities can be biased if a prior that excludes zero-length branches is applied to data generated from a topology that, in fact, includes polytomies, an observation first made by Suzuki et al. (2002). Lewis et al. (2005) demonstrated that if a polytomy exists but is not accommodated in the prior, resolution of the polytomy will be arbitrary and the nodal support indicated by the posterior probability will appear unusually high compared to ML bootstraps. As with the problems noted by Yang & Rannala (2005), this can be circumvented by applying more appropriate prior distributions or by using reversible jump MCMC to permit internal branches not supported by the data to collapse to polytomies (Lewis et al. 2005).

Another observation from simulated (Huelsenbeck & Rannala 2004, Lemmon & Moriarty 2004, Suzuki et al. 2002) and empirical data (Buckley 2002, Waddell et al. 2002) is how model misspecification affects posterior probabilities relative to bootstrap proportions. The simulations by Huelsenbeck & Rannala (2004) and the empirical study of Buckley (2002) show how posterior probabilities respond in a more extreme fashion to model misspecification than the bootstrap or bootstrap-based topology tests. This problem is likely to be exacerbated by branch-length heterogeneity (Felsenstein and inverse-Felsenstein zone problems) and a high rate of change across the tree, both of which typify many mtDNA data sets. This problem can obviously be rectified by implementing a more complex substitution model; however, there is no guarantee that the models as implemented in available software packages will be sufficient. Because we have little knowledge of the goodness of fit between data and model in typical phylogenetic studies (although goodness of fit tests do exist), we have little idea of the seriousness of the problem of model misspecification in current implementations of Bayesian phylogenetic inference. Finally, failure of convergence of the MCMC algorithm is an underappreciated problem, especially for large data sets. Failure to diagnose a lack of convergence of the algorithm will lead to incorrect posterior probabilities (Huelsenbeck et al. 2002).

Given these issues, which method is best for quantifying phylogenetic support? This is a difficult question to answer because it partly depends on one’s philosophical approach to statistical inference. However, if the desired measure of support is the probability that a node is correct given the data set and the model, then the only way to calculate this is by Bayes’ theorem. Some researchers have attempted to reconcile Bayesian and bootstrap approaches by merging multiple Bayesian analyses from bootstrapped data sets, the so-called Bayesian bootstrap (Douady et al. 2003, Waddell et al. 2002). However, the exact statistical interpretation of these values is not at all obvious, and this combination of distinct statistical paradigms has yet to be justified. In terms of practical advice, a review of the current Bayesian phylogenetic literature indicates that much more emphasis needs to be placed on developing more realistic models, checking the effects of the priors, and monitoring the convergence of posterior distributions.

Tests of Topology

In many phylogenetic studies it is important to assess the information content of the data with respect to the entire tree relative to an alternative hypothesis in addition to understanding support for individual nodes. A variety of tests of topology have been designed to achieve this goal. The currently used tests of topology can be divided into two types: frequentist and Bayesian. The frequentist tests can in turn be divided into parametric tests and nonparametric tests. The most widely used parametric test is the Swofford-Olsen-Waddell-Hillis (SOWH) test (Swofford et al. 1996), which uses parametric bootstrapping to simulate replicate data sets that are in turn used to obtain the null distribution. The SOWH test has been applied in a wide variety of studies ranging from comparative phylogeography (e.g., Carstens et al. 2005b) to deeper phylogenetics (e.g., Rokas et al. 2002). The nonparametric tests use the nonparametric bootstrap to generate replicates that are then used to construct the null distribution. The Kishino-Hasegawa (KH) test (Kishino & Hasegawa 1989) is a nonparametric test designed to compare pairs of topologies selected before a phylogenetic analysis and may become too liberal when the maximum-likelihood topology (selected a posteriori) is tested against another topology (Goldman et al. 2000). The Shimodaira-Hasegawa (SH) test (Shimodaira & Hasegawa 1999) and the approximately-unbiased (AU) test (Shimodaira 2002) simultaneously compare sets of topologies and incorporate more complex bootstrap procedures to correct for the bias associated with multiple comparisons and inclusion of the maximum-likelihood topology. The study by Buckley et al. (2001b) demonstrates the effect of these assumptions on the SH test relative to the KH test. For these reasons, and because of the nature of the null hypotheses employed by the nonparametric tests, the SH and KH tests are generally more conservative than the parametric tests (e.g., Aris-Brosou 2003, Buckley 2002, Goldman et al. 2000). The more explicit reliance on models of evolution by the parametric tests makes them very powerful tests, yet they are also more susceptible to model misspecification (e.g., Buckley 2002, Huelsenbeck et al. 1996, Shimodaira 2002).
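As an illustration of how a nonparametric topology test builds its null distribution, the sketch below implements a KH-style comparison of two topologies chosen a priori. It assumes the per-site log-likelihoods for each topology have already been computed by external software, and it resamples those values directly rather than re-optimizing each replicate (the shortcut usually called RELL). The function name and default settings are illustrative, not taken from any particular package.

```python
import random


def kh_test(site_lnl_1, site_lnl_2, n_replicates=1000, seed=1):
    """KH-style test from per-site log-likelihoods of two topologies.

    Returns (observed_difference, two_sided_p_value).
    """
    if len(site_lnl_1) != len(site_lnl_2) or not site_lnl_1:
        raise ValueError("need two equal-length, non-empty lists")
    diffs = [a - b for a, b in zip(site_lnl_1, site_lnl_2)]
    n = len(diffs)
    observed = sum(diffs)
    mean_diff = observed / n
    # Center the differences so that replicates follow the null
    # hypothesis of no expected difference between the topologies.
    centered = [d - mean_diff for d in diffs]
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_replicates):
        replicate = sum(centered[rng.randrange(n)] for _ in range(n))
        if abs(replicate) >= abs(observed):
            extreme += 1
    return observed, extreme / n_replicates
```

The centering step encodes the assumption that neither topology is favored in advance, which is why the test becomes too liberal when one of the two trees was selected because it had the highest likelihood.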

Bayesian tests of topology (e.g., Aris-Brosou 2003) are much less commonly implemented than the frequentist tests. The Bayesian tests generally rely on Bayes factors (Kass & Raftery 1995) to compare marginal likelihoods generated under two hypotheses corresponding to different topologies. The use of Bayes factors in testing topologies will likely receive much greater attention in the future (Huelsenbeck & Imennov 2002, Suchard et al. 2005). One example of a Bayesian test of topology is that of Carstens et al. (2005a), who assessed whether posterior distributions of trees contained topologies consistent with a priori demographic hypotheses.
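Once the two marginal log-likelihoods are in hand, the Bayes factor comparison itself is a one-line calculation. The sketch below assumes those values have been estimated elsewhere (their estimation is the hard part and is not shown) and labels the result using the rough verbal scale given by Kass & Raftery (1995). The function names are illustrative.

```python
def two_ln_bayes_factor(ln_marginal_1, ln_marginal_2):
    """Twice the log Bayes factor in favor of hypothesis 1 over hypothesis 2."""
    return 2.0 * (ln_marginal_1 - ln_marginal_2)


def kass_raftery_category(two_ln_bf):
    """Verbal strength of evidence for the favored hypothesis."""
    magnitude = abs(two_ln_bf)
    if magnitude < 2:
        return "not worth more than a bare mention"
    if magnitude < 6:
        return "positive"
    if magnitude < 10:
        return "strong"
    return "very strong"
```

For instance, marginal log-likelihoods of -1000 and -1004 for two topological hypotheses give a value of 8, which falls in the "strong" band in favor of the first hypothesis.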

Attaching Time to Phylogenies

The past several years have seen a renewed interest in methods for attaching time estimates to phylogenetic trees (Arbogast et al. 2002, Welch & Bromham 2005). Early attempts to estimate divergence times when the molecular clock was violated often involved removing taxa with aberrant rates of evolution (Hedges & Kumar 2003), decomposing the procedure into quartets of taxa that conformed to the molecular clock (e.g., Rambaut & Bromham 1998), or fitting local clocks to different regions of the phylogeny (Yoder & Yang 2000). More recently, the focus has shifted to explicitly accounting for rate changes over the tree with a growing emphasis on using models to quantify the uncertainty. The first attempts to correct for changing rates with time were the nonparametric and semiparametric rate-smoothing methods described by Sanderson (1997, 2002). The methods based on explicit models, known as relaxed-clock models, used Bayesian estimation to obtain posterior distributions of node times (Aris-Brosou & Yang 2002, Thorne et al. 1998). For example, the method of Kishino et al. (2001) assumes that the rate of evolution along a descendant branch is a random variable drawn from a log-normal distribution, whose mean is the rate of evolution of the parent branch. This approach has been modified by other researchers who use different distributions in relaxed-clock models (e.g., Aris-Brosou & Yang 2002). Given the expanding range of relaxed-clock models it is becoming increasingly important to justify the use of one model over another. However, model selection procedures are rarely applied to relaxed-clock models. Aris-Brosou & Yang (2002) first applied Bayesian model selection to different relaxed-clock models, and we expect these methods to be more commonly applied in the future, especially as different relaxed-clock models are incorporated into user-friendly software packages. We currently have little information as to how well DNA data sets fit the various relaxed-clock models, although simulation and empirical studies show that these methods can be misleading when the relaxed-clock model deviates strongly from the actual process of changing rates over the tree (Welch et al. 2005), as is likely to be true for comparisons of closely related populations experiencing slightly deleterious mutations (Ho et al. 2005). Another serious source of error in dating studies is the manner in which dates are calibrated. It is desirable to have as many calibration points as possible, and new methods that improve the ability to incorporate uncertainty into fossil dates are an important step forward (Yang & Rannala 2006).
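The log-normal rate model described above can be sketched as a single random draw per branch. The version below is a simplified illustration rather than the published implementation: it assumes the variance of the log rate grows linearly with the branch duration (a parameter called nu here, a hypothetical name), and it shifts the log-scale mean so that the expected child rate equals the parent rate.

```python
import math
import random


def draw_child_rate(parent_rate, branch_duration, nu, rng):
    """Draw a descendant-branch rate from a log-normal centered on the parent.

    parent_rate:      rate on the ancestral branch (substitutions/site/time)
    branch_duration:  elapsed time separating parent and child branches
    nu:               assumed variance of the log rate per unit time
    """
    variance = nu * branch_duration
    sigma = math.sqrt(variance)
    # Subtracting variance / 2 makes the mean of the log-normal equal
    # to parent_rate rather than its median.
    mu = math.log(parent_rate) - variance / 2.0
    return rng.lognormvariate(mu, sigma)
```

Under these assumptions, long branches or large values of nu let descendant rates wander far from the ancestral rate, whereas nu close to zero approaches a strict molecular clock.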

A further complication for dating divergences using mitochondrial DNA is model misspecification compounded by the typically rapid rate of evolution even for divergences that are only a few million years old. Buckley et al. (2001b) observed large differences in branch-length estimates among substitution models for a group of New Zealand cicada genera that began to radiate 10 Mya (Arensburger et al. 2004). These results indicate that, even for divergences a few million years old, the substitution model can be very important for obtaining reliable divergence times. Furthermore, if data are partitioned, then APRV must be properly accommodated and suitable branch-length priors employed (Marshall et al. 2006).

Finally, when attaching estimates of divergence time to recent speciation events, the well-known discordance between species and gene divergence times must be taken into account. Coalescent theory predicts that mtDNA haplotype divergence dates will often predate the speciation event by a substantial amount if ancestral effective population size was large (Edwards & Beerli 2000). Jennings & Edwards (2005) showed how up to 10 loci were required before stable estimates of species divergence times were obtained for three closely related species of Australian finches. Although large numbers of loci are rarely available for most empirical studies, this uncertainty can be adequately captured if coalescent times are accounted for in divergence-time estimation (Edwards & Beerli 2000). Various methods have been described to accommodate this potential bias (e.g., Edwards & Beerli 2000); however, these methods and the relaxed-clock models have yet to be implemented in a single framework of divergence-time estimation.
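The size of the expected gap between gene and species divergence can be sketched with a back-of-the-envelope calculation. Under a neutral Wright-Fisher model, two lineages in an ancestral population containing N gene copies coalesce on average N generations back. The sketch assumes a diploid, randomly mating ancestor with an equal sex ratio and strictly maternal mtDNA inheritance, so a nuclear autosomal locus has 2Ne copies and mtDNA has Ne/2. These are simplifying assumptions and the function name is illustrative.

```python
def expected_gene_divergence(species_divergence_generations,
                             ancestral_ne, marker):
    """Expected gene divergence time (generations) for two sampled lineages.

    ancestral_ne: effective number of diploid individuals in the ancestor.
    marker:       "nuclear" (autosomal) or "mtDNA".
    """
    if marker == "nuclear":
        gene_copies = 2.0 * ancestral_ne   # two copies per individual
    elif marker == "mtDNA":
        gene_copies = ancestral_ne / 2.0   # one copy, transmitted by females
    else:
        raise ValueError("marker must be 'nuclear' or 'mtDNA'")
    # Mean pairwise coalescence time equals the number of gene copies.
    return species_divergence_generations + gene_copies
```

With a split 1,000,000 generations ago and an ancestral effective size of 100,000, the expected gene divergence is 1,200,000 generations for a nuclear locus and 1,050,000 for mtDNA. The excess is smaller for mtDNA but still grows in direct proportion to ancestral population size.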

THE PHYLOGENETIC USEFULNESS OF MITOCHONDRIAL DNA

Mitochondrial Genes and Phylogeny

At the species level, mitochondrial gene data are by far the most widely used marker for assessing phylogenetic relationships. They offer some advantages over nuclear data for practical reasons (ease of actually obtaining the sequences, direct comparisons across different studies, and higher levels of variability). In addition, mtDNA genes have faster coalescent times owing to the smaller effective population size of their haploid, maternally inherited genomes. Thus, through genetic drift, their gene trees achieve species-level reciprocal monophyly sooner after speciation than do gene trees generated from nuclear substitutions (Sunnucks 2000). The extensive use of mitochondrial genes in phylogenetic reconstructions (e.g., Caterino et al. 2000) has generated an overwhelmingly greater amount of data for mitochondrial genes compared to nuclear genes for all taxa of Metazoa (e.g., over 800 complete, or nearly complete, mitochondrial genomes were available in GenBank as of July 2006). Historically, the mitochondrial genes most often used for phylogenetic purposes are co1, co2, ssu (small subunit) and lsu (large subunit) rRNA, cytb, and the control region (Caterino et al. 2000, Meyer & Zardoya 2003), but most regions of the mitochondrial genome are similarly useful; Simon et al. (1994) discuss the relative usefulness of the various mitochondrial genes at different levels of divergence (see especially their table 1).

Comparison of Substitution Rates in Mitochondrial versus Nuclear Genes

It has long been known that mitochondrial genes evolve faster than the majority of genes encoded in the nuclear genome (Brown et al. 1982). Although the number of nuclear genes is much greater, and the variance of the average nonsynonymous substitution rates among them is reasonably expected to be much larger, synonymous substitutions of mitochondrial genes have been empirically estimated to accumulate 1.7–3.4 times as fast as in the most rapidly evolving nuclear genes, and 4.5–9 times as fast if one averages across all nuclear genes studied (Moriyama & Powell 1997). These estimates may be biased, however, because genes chosen for analysis are usually the most conserved in order to facilitate primer design. Nuclear introns, although faster than nuclear coding regions, are still slower than mtDNA (Zhang & Hewitt 2003). Heterogeneity in substitution rate divergence between mitochondrial and nuclear genes has been observed as a function of the genes and taxa studied (Lin & Danforth 2004). Faster rates of evolution in mitochondrial genes have been related to higher rates of transition mutations (Brown et al. 1982) and stronger constraints in nuclear genes due to selection for codon usage (Moriyama & Powell 1997). Other factors believed to influence the rates of evolution of mitochondrial genes include thermal adaptation, mitochondrial-nuclear interactions, and infection with Wolbachia (Ballard & Rand 2005). The faster evolution of mitochondrial genes implies higher levels of multiple substitutions than nuclear genes, especially at synonymous sites (Goto & Kimura 2001, Overton & Rhoads 2004), with an obvious effect on levels of homoplasy when genes are used for phylogenetic inference (Lin & Danforth 2004). Another major difference between nuclear and mitochondrial genes that could influence phylogenetic inference is the greater nucleotide compositional bias of mtDNA, especially in insects where A+T bias can be very extreme (Simon et al. 1994). As a further complication, each of the two strands of mtDNA may exhibit different patterns of base compositional bias, and these patterns can change because of gene rearrangements (Hassanin et al. 2005).
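The effect of multiple substitutions on observed divergence can be illustrated with the simplest available correction. The sketch below uses the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), purely as an illustration. That model assumes equal base frequencies and equal rates for all changes, assumptions that mtDNA violates for the reasons given above, so the numbers indicate only the direction and rough size of the problem.

```python
import math


def jukes_cantor_distance(p_observed):
    """Expected substitutions per site given the observed proportion of
    differing sites, under the Jukes-Cantor model."""
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("p_observed must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p_observed / 3.0)
```

At 5% observed difference the correction is negligible, but at 30% observed difference the estimated number of substitutions per site is about 0.38, and the correction grows without bound as the observed difference approaches 75%. Fast-evolving genes reach this saturated range at much shallower divergences than slow-evolving ones.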

Performance of Mitochondrial versus Nuclear Genes for Phylogeny

The combination of these factors affects phylogenetic inference and has led to many attempts to evaluate the differential performance of nuclear and mitochondrial genes in phylogenetic reconstructions. Mitochondrial and nuclear genes have often been found to differ significantly in phylogenetic signal (Overton & Rhoads 2004), a pattern that is to be expected at shallow phylogenetic levels, where different genes may present truly different allelic histories (gene trees) owing to as yet incomplete sorting of ancestral mtDNA haplotype polymorphisms under drift. Nuclear genes are suspected to outperform mitochondrial genes in phylogenetic inference when the depth of the tree is such that the nuclear genes possess sufficient variability (Lin & Danforth 2004), whereas more rapidly evolving mitochondrial genes will have experienced more multiple substitutions and associated homoplasy. Obviously, the faster-evolving mitochondrial genes provide more resolving power for the phylogeny of closely related taxa and for phylogeographic and population genetic studies (Avise 2000, Zhang & Hewitt 2003), but are more problematic for resolving the deepest nodes of a phylogenetic tree (distantly related taxa), because of the extreme compositional biases, the asymmetry of transformation-rate matrices, the higher amount of homoplasy, and the higher levels of ASRV (Lin & Danforth 2004, Springer et al. 2001). This has led to the conclusion that mitochondrial genes should be used largely for the phylogenetics of closely related taxa (Cenozoic divergences) and that they require highly parameterized models that correct for some of the best known evolutionary anomalies if they are to be used for the phylogenetic analysis of more ancient divergences (Lin & Danforth 2004). Nevertheless, as discussed earlier, rates of evolution are not the only important parameters; other features, such as the heterogeneity of rates of

