J Mol Evol (2006) 63:120–126
DOI: 10.1007/s00239-005-0255-4
Thermal Adaptation of the Small Subunit Ribosomal RNA Gene:
A Comparative Study
Huai-Chun Wang,1,2 Xuhua Xia,2 Donal Hickey3
1 Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia B3H 3J5, Canada
2 Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
3 Department of Biology, Concordia University, Montreal, Quebec H3G 1M8, Canada
Received: 28 October 2005 / Accepted: 1 March 2006 [Reviewing Editor: Dr. Nicolas Galtier]
Abstract. We carried out a comprehensive survey of
small subunit ribosomal RNA sequences from archaeal, bacterial, and eukaryotic lineages in order to
understand the general patterns of thermal adaptation
in the rRNA genes. Within each lineage, we compared
sequences from mesophilic, moderately thermophilic,
and hyperthermophilic species. We carried out a more
detailed study of the archaea, because of the wide range
of growth temperatures within this group. Our results
confirmed that there is a clear correlation between the
GC content of the paired stem regions of the 16S rRNA
genes and the optimal growth temperature, and we
show that this correlation cannot be explained simply
by phylogenetic relatedness among the thermophilic
archaeal species. In addition, we found a significant,
positive relationship between rRNA stem length and
growth temperature. These correlations are found in
both bacterial and archaeal rRNA genes. Finally, we
compared rRNA sequences from warm-blooded and
cold-blooded vertebrates. We found that, while rRNA
sequences from the warm-blooded vertebrates have a
higher overall GC content than those from the cold-blooded vertebrates, this difference is not concentrated
in the paired regions of the molecule, suggesting that
thermal adaptation is not the cause of the nucleotide
differences between the vertebrate lineages.
Key words: Small subunit ribosomal RNA; Secondary structure; Phylogenetic independent contrast; GC content; Optimal growth temperature
Correspondence to: Donal Hickey; email: dhickey@alcor.concordia.ca
Introduction
Previous studies have shown that the GC content of
16S ribosomal RNAs is highly correlated with the
optimal growth temperature in prokaryotes (Dalgaard and Garrett 1993; Galtier and Lobry 1997;
Hurst and Merchant 2001; Wang and Hickey 2002;
Nakashima et al. 2003). The increased GC content is
concentrated in the stem regions of the molecule,
supporting the conclusion that it is the result of
natural selection acting to increase the structural
stability of the rRNA molecule. In this study, we
postulated that structural stability could be increased,
not only by a higher GC content in stem regions but
also by an increase in the stem length in thermophilic
species, because both would increase the number of
hydrogen bonds in the stem. Thus we predicted that
both GC content and stem length would be increased
among the thermophiles.
Another question that is interesting to evolutionary biologists is how quickly the rRNA genes in
prokaryotic species can evolve in response to temperature differences. Because of the lack of a universal molecular clock in prokaryotes, we addressed
the question by comparing the GC content and
length of stems and loops between mesophilic and
thermophilic species within the same genus.
We were also concerned about the lack of sample
independence that is inherent in cross-species comparisons that simply regress a given phenotypic character on an environmental variable. Since the species are all part of a hierarchically structured phylogeny, they
cannot be regarded as independent samples from the
same distribution (Felsenstein 1985; Harvey and Pagel
1991). Therefore, in this study we controlled for phylogenetic relatedness among the archaeal species by
using phylogeny-based comparative methods.
Our main focus was on archaeal species, because
this group is characterized by the greatest diversity of
optimal growth temperatures. In addition, however,
we did separate analyses of both bacterial and vertebrate small subunit rRNAs. This allowed us to assess whether the patterns of nucleotide content and
stem length that we observed when comparing mesophilic and thermophilic archaea were repeated in
the other kingdoms of life.
Materials and Methods
Sequence Data
The 16S rRNA sequence data were downloaded from the SSU rRNA database (http://oberon.fvms.ugent.be:8080/rRNA/ssu/index.html; Wuyts et al. 2002). For prokaryotes, we retrieved 10,566
bacterial and 590 archaeal 16S rRNA sequences. For each rRNA
sequence the GC content and length were calculated. In addition to
calculating the total GC content and sequence length for each
rRNA, we also calculated these values separately for the stem and
loop regions, as defined structurally in the SSU rRNA database.
Since many species have multiple entries in the 16S rRNA sequence database, the data were filtered by retaining only the first
entry for each species, which left rRNA data for a total of 4598
species. (When all sequences for a single species were retained and the mean values for each species were used instead, the subsequent analyses gave results very similar to those obtained with the first-entry filter.) 18S rRNA sequences of 84 vertebrate species were also collected, of which 38 species are warm-blooded animals (34 birds and 4 mammals) and 46 species are cold-blooded animals (23 fish, 18 amphibians, and 5 reptiles).
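The per-region calculation described above can be sketched as follows. This is an illustration only, not the authors' code: the function name `region_stats` and the `'1'`/`'0'` pairing mask are hypothetical stand-ins for whatever structural annotation the SSU rRNA database provides.

```python
def region_stats(sequence, paired_mask):
    """Return length and GC content (%) for stem and loop positions.

    sequence    : RNA or DNA string; alignment gaps ('-') are skipped.
    paired_mask : string of the same length; '1' marks a base-paired (stem)
                  position and '0' an unpaired (loop) position. This encoding
                  is an assumption made for the example.
    """
    if len(sequence) != len(paired_mask):
        raise ValueError("sequence and mask must have the same length")
    counts = {"stem": [0, 0], "loop": [0, 0]}  # [GC count, total bases]
    for base, flag in zip(sequence.upper(), paired_mask):
        if base == "-":
            continue
        region = "stem" if flag == "1" else "loop"
        counts[region][1] += 1
        if base in "GC":
            counts[region][0] += 1
    stats = {}
    for region, (gc, total) in counts.items():
        gc_percent = 100.0 * gc / total if total else float("nan")
        stats[region] = {"length": total, "gc_percent": gc_percent}
    return stats


# Toy hairpin: a 4-bp GC-only stem enclosing a 4-base AU-only loop.
example = region_stats("GGCCAAUUGGCC", "111100001111")
```

In this toy case the stem positions are all G or C and the loop positions are all A or U, so the two regions give opposite extremes of GC content.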
Growth Temperature
Optimal growth temperatures of 9168 bacterial and archaebacterial species were kindly provided by Dr. Manfred Kracht of the German Collection of Microorganisms and Cell Cultures (http://www.dsmz.de). Many species are listed more than once in the database, sometimes with different growth temperatures. For example, there are 101 temperature entries for Bacillus sp., 82 of which have a temperature of 30°C. In other cases the temperatures are all the same or similar. For simplicity, the temperature of the first entry for each species was assigned to that species, which leaves temperature data for 3630 unique species. We also calculated the average temperature for each species that had multiple entries. In the case of Bacillus sp., for instance, the average temperature is 34.49°C. The correlation between the two temperature data sets is 0.98 (p < 0.00001), and there is no marked difference in subsequent analyses when using either data set. The
temperature data were then merged with the rRNA data, and 1673
prokaryotic species that have both temperature and rRNA data
(GC content and gene length) were obtained. Of these, 1573 species
are bacterial and the other 100 are archaeal.
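The two ways of collapsing duplicate entries, and the merge with the rRNA data, can be sketched as below. The function names and the sample records are hypothetical; the real inputs were the DSMZ temperature list and the SSU rRNA database.

```python
from statistics import mean


def first_entry(records):
    """Keep the first value seen for each species."""
    first = {}
    for species, value in records:
        first.setdefault(species, value)
    return first


def mean_entry(records):
    """Average all values recorded for each species."""
    grouped = {}
    for species, value in records:
        grouped.setdefault(species, []).append(value)
    return {species: mean(values) for species, values in grouped.items()}


def merge(temperatures, rrna_stats):
    """Keep only species present in both data sets."""
    return {s: (temperatures[s], rrna_stats[s])
            for s in temperatures if s in rrna_stats}


# Made-up records: (species, optimal growth temperature in degrees C).
records = [("sp. A", 30.0), ("sp. A", 37.0), ("sp. B", 55.0)]
by_first = first_entry(records)
by_mean = mean_entry(records)
merged = merge(by_first, {"sp. B": 70.1, "sp. C": 66.0})
```

Comparing `by_first` with `by_mean` is the kind of check that the reported correlation of 0.98 between the two temperature data sets summarizes.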
Statistical Analyses
We used nonparametric Wilcoxon rank sum tests to compare average GC content and length between two temperature groups, such as mesophiles and thermophiles in bacteria. Kruskal-Wallis rank sum tests were used to compare the data among three temperature groups: mesophilic, moderately thermophilic, and hyperthermophilic bacteria.
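For readers unfamiliar with the three-group test, a minimal pure-Python sketch of the Kruskal-Wallis H statistic with tie correction is given below. It is not the software used in this study. The p-value uses the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2); that approximation is rough for very small samples. The input values are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(*groups):
    """Return (H, p); p is computed only for exactly three groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_sizes = {}
    for x in pooled:
        tie_sizes[x] = tie_sizes.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    if correction > 0:
        h /= correction
    # Chi-square survival function with 2 df is exp(-H/2).
    p = math.exp(-h / 2) if len(groups) == 3 else None
    return h, p


# Invented stem GC values for three fully separated groups.
h_stat, p_value = kruskal_wallis([1, 2, 3], [4, 5, 6], [7, 8, 9])
```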
For the archaeal species we obtained phylogenetically independent contrasts for temperature and GC content and, also, for temperature and sequence length. These contrasts were calculated using
the CONTRAST program (Felsenstein 1985) implemented in the
PHYLIP program package, version 3.6 (beta release; http://evolution.genetics.washington.edu/phylip.html). The phylogenetic relationships are based on two recently published archaeal trees
(Brochier et al. 2004). One of these trees was constructed from the
concatenated sequences of transcription proteins (such as RNA
polymerases) of 20 archaeal species whose complete or near-complete genomes have been sequenced. The other tree was derived from
a concatenation of ribosomal proteins of 18 archaeal species for
which complete genome sequences are available. The reason that
protein trees rather than an rRNA tree were used here is twofold.
First, it is known that phylogeny based on rRNA can be biased when
the GC content of the rRNAs is very different (e.g., Hasegawa and
Hashimoto 1993; Foster and Hickey 1999). Second, since we are
studying the composition of rRNA sequences, it is necessary to use a
phylogeny that is not based on these sequences themselves.
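The contrast calculation itself was done with the CONTRAST program; the sketch below only illustrates the underlying algorithm of Felsenstein (1985) on a toy three-species tree. The tree encoding (nested tuples of `(child, branch_length)` pairs), the trait values, and the choice to correlate contrasts through the origin are assumptions made for this example.

```python
import math


def _contrasts(node, trait):
    """Return (node value, extra branch length, contrasts below the node)."""
    if isinstance(node, str):  # a leaf is identified by its species name
        return trait[node], 0.0, []
    (left, v_left), (right, v_right) = node
    x_l, extra_l, c_l = _contrasts(left, trait)
    x_r, extra_r, c_r = _contrasts(right, trait)
    v_l = v_left + extra_l    # branch lengthened to reflect the
    v_r = v_right + extra_r   # uncertainty of the estimated node value
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)
    value = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
    extra = v_l * v_r / (v_l + v_r)
    return value, extra, c_l + c_r + [contrast]


def independent_contrasts(tree, trait):
    """Standardized contrasts, one per internal node of a bifurcating tree."""
    return _contrasts(tree, trait)[2]


def correlation_through_origin(xs, ys):
    """Correlation of two contrast sets, computed without centering."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / math.sqrt(sum(x * x for x in xs) * sum(y * y for y in ys))


# Hypothetical tree ((A:1, B:1):1, C:2) with invented trait values.
tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
temperature = {"A": 30.0, "B": 40.0, "C": 80.0}
stem_gc = {name: 50.0 + 0.4 * t for name, t in temperature.items()}

temp_contrasts = independent_contrasts(tree, temperature)
gc_contrasts = independent_contrasts(tree, stem_gc)
r = correlation_through_origin(temp_contrasts, gc_contrasts)
```

A tree of n species yields n - 1 contrasts. Because both traits are evaluated on the same tree in the same traversal order, the two contrast lists can be paired element by element.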
16S rRNA sequences for the 20 archaeal species were extracted
from the SSU rRNA database (Wuyts et al. 2002) or GenBank if
they were not available in the rRNA database. For Methanosarcina
acetivorans, only a partial sequence of 16S rRNA (GenBank locus
MESRR16SA; 1426 bp) was retrieved from GenBank. Three copies of the 16S rRNA gene are annotated (MA0896, MA1618, MA4614) in the complete genome sequence of M. acetivorans (Galagan et al. 2002), and they are also partial (1427 bp each).
Results
This study consisted of a number of complementary
analyses of 16S and 18S rRNA sequences. First, we
did a broad survey of the average trends in mesophilic,
thermophilic, and hyperthermophilic species of
prokaryotes. Second, we performed a more detailed
analysis of the archaeal sequences, with particular
emphasis on the potential confounding effects of
phylogenetic relatedness. Third, we looked for adaptive divergence between sequences from closely related species of bacteria, in an attempt to estimate the
evolutionary time that is required to achieve adaptive
change in these sequences. Finally, we examined
rRNA sequences from warm-blooded and cold-blooded vertebrate species in order to see if the same
adaptive patterns could be seen in these lineages.
Parallel Trends in the Nucleotide Composition of
Archaea and Bacteria
For this analysis, species were grouped according to their optimal growth temperatures. We defined the following three groups: the mesophile group contained species with optimal growth temperatures less than 40°C, in which 1461 species are bacterial and 61 species are archaeal; the thermophile group included those species with growth temperatures between 40 and 75°C, in which 106 species are bacterial and 8 species are archaeal; and the hyperthermophile group contained all species with a growth temperature higher than 75°C, with 6 species being bacterial and 31 species archaeal. The boundaries of these groupings are arbitrary; they were defined to create three major "bins" that spanned the entire range of growth temperatures and that contained several species in each group. For each group, the archaeal and bacterial species were analyzed separately.

Table 1. The average GC content and sequence length, ± standard error, of 16S rRNA stems and loops for mesophilic (<40°C), moderately thermophilic (40–75°C), and hyperthermophilic (≥75°C) bacteria and archaea

                          Stem regions                                             Loop regions
                          Mesophiles         Thermophiles       Hyperthermophiles  Mesophiles         Thermophiles       Hyperthermophiles
(A) Nucleotide composition (GC%)
  Bacteria (n = 1573)     66.1 ± 0.11        70.1 ± 0.37        76.5 ± 1.55        40.1 ± 0.04        41.6 ± 0.17        41.2 ± 0.35
  Archaea (n = 100)       67.4 ± 0.26        70.8 ± 1.62        81.0 ± 0.41        38.97 ± 0.21       39.2 ± 0.57        42.1 ± 0.25
(B) Sequence length (bases)
  Bacteria (n = 1573)     885.2 ± 0.52       894.9 ± 2.88       922.2 ± 6.87       637.3 ± 0.39       640.6 ± 2.44       634.3 ± 4.14
  Archaea (n = 100)       881.3 ± 1.83       891.1 ± 4.55       903.2 ± 3.03       590.2 ± 1.17       590.9 ± 2.38       589.4 ± 2.01
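The three-way grouping can be written as a small classifier. This is a sketch under one stated assumption: a species at exactly 75°C is placed with the hyperthermophiles, following the "≥75°C" boundary in the Table 1 caption. The temperatures in the example are invented.

```python
from collections import Counter


def temperature_group(optimal_temp_c):
    """Assign a species to one of the three temperature bins."""
    if optimal_temp_c < 40:
        return "mesophile"
    if optimal_temp_c < 75:
        return "thermophile"
    return "hyperthermophile"  # assumption: 75 itself falls in this bin


# Invented optimal growth temperatures (degrees C) for six species.
sample_temps = [25, 37, 40, 60, 75, 98]
group_sizes = Counter(temperature_group(t) for t in sample_temps)
```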
Rather than simply calculating the GC content of
the entire rRNA sequences for each group, we partitioned the sequences into the double-stranded stem
regions and the single-stranded loop regions. The
results are shown in Table 1A. From the table, we
can see that the GC content of the stem regions is
highest in the hyperthermophiles, intermediate in the
moderate thermophiles, and lowest in the mesophiles.
This is true of both the archaeal and the bacterial
groups. The Kruskal-Wallis rank sum tests indicated
that these differences are highly significant (p <
0.0001). In contrast to the stem regions, the nucleotide content of the loop regions does not differ greatly
among the three groups and there is no obvious
correlation between the GC content of the loops and
the optimal growth temperature. Overall, these results confirm previous reports that the GC content of
stem regions is positively correlated with the optimal
growth temperature, while the loop regions of the
same molecules show little or no correlation.
Parallel Trends in Sequence Length of Archaea and
Bacteria
Table 1B shows the results of a different analysis of
the same species groups as reported in Table 1A. In
this case, we calculated the average cumulative sequence length of the stem and loop regions of the
rRNA, rather than the average GC content as in
Table 1A above. We see that the average stem length
is greatest among the hyperthermophiles, intermediate among the moderate thermophiles, and shortest
among the mesophiles. Again, this is true for both the
archaeal and the bacterial groups, and the differences
are highly significant based on the Kruskal-Wallis
rank tests. In contrast to the stem regions, the lengths
of the loop regions do not show the same trends (see
Table 1B) and the differences between groups are not
statistically significant.
Taken together, these results confirm our prediction that thermophilic archaea and bacteria are selected for both higher GC content and longer
sequence length in the stem regions. Our prediction
was based on the fact that both factors (GC content
and length) would result in an increased number of
hydrogen bonds, thus increasing the stability of the
stems at higher growth temperatures.
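The hydrogen-bond argument can be made concrete with a simple count: a G-C pair contributes three hydrogen bonds, while an A-U pair or a G-U wobble pair contributes two. The helper below is a hypothetical illustration, not part of the analysis, and a bond count is only a crude proxy for stem stability, which also depends on base stacking.

```python
# Hydrogen bonds per base pair: 3 for G-C, 2 for A-U and for G-U wobble.
HBONDS = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 2, ("U", "G"): 2,
}


def stem_hydrogen_bonds(strand_5p, strand_3p):
    """Count hydrogen bonds in a stem formed by two strands.

    Both strands are written 5'->3', so the second strand is reversed
    before pairing. Non-canonical pairs contribute zero.
    """
    if len(strand_5p) != len(strand_3p):
        raise ValueError("strands must have the same length")
    pairs = zip(strand_5p.upper(), reversed(strand_3p.upper()))
    return sum(HBONDS.get(pair, 0) for pair in pairs)


au_stem = stem_hydrogen_bonds("AAUU", "AAUU")           # 4 A-U pairs
gc_stem = stem_hydrogen_bonds("GGCC", "GGCC")           # 4 G-C pairs
longer_gc_stem = stem_hydrogen_bonds("GGCCG", "CGGCC")  # 5 G-C pairs
```

Replacing A-U pairs with G-C pairs raises the count at fixed length, and adding a pair raises it at fixed composition, which is why both factors were expected to increase together in thermophiles.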
It is interesting to note that despite the highly
parallel trends in the stem regions of archaeal and
bacterial rRNAs, there is a significant difference in
the overall length of the rRNA molecule between the
two groups. Specifically, the bacterial rRNAs are
longer than their archaeal homologues. This overall
length difference has already been noted (Wuyts et al.
2002) and can be explained by the fact that the bacterial rRNAs have two extra hairpin loop structures.
As can be seen from Table 1B, most of the length
difference between the archaeal and the bacterial
rRNAs is in unpaired loop regions of the molecule.
Phylogeny-Based Analysis of Thermal Adaptation in
Archaeal 16S rRNA
While the vast majority of bacterial species are mesophiles with optimal growth temperatures below
40°C, the archaea span a much wider range of growth
temperatures. For this reason we performed a more
detailed analysis of the archaeal sequences. First, we
plotted the GC content of 100 archaeal rRNA sequences against the optimal growth temperatures of
the individual species (see Fig. 1). From Fig. 1 we can
see the apparent strong correlation between the GC
content of the stem regions and the optimal growth temperature. We were concerned, however, that this apparent correlation might be due to phylogenetic clumping of the species. In order to eliminate this possibility, we derived phylogenetically independent contrasts (see Methods) for (i) GC content of the 16S rRNA sequence and growth temperature and (ii) sequence length and growth temperature. The results are given in Table 2. It is clear that the correlations remain when phylogenetic relationships are taken into account. Thus, we can conclude that the differences in both sequence composition and sequence length are due to natural selection rather than historical accident. One of the correlations for the contrasts based on sequence length and temperature is much lower than the others (see Table 2). This is the result based on the phylogeny that was built using transcription-related proteins. Upon further investigation, we found that this lowered correlation was due to two outlier points associated with three Methanosarcina species. Furthermore, we found that the database entry for M. acetivorans is a partial rRNA sequence (from GenBank) which has only 1426 bases, while the closely related M. mazei is represented by a full-length sequence of 1474 bases. This results in a large contrast both between these two species and between the ancestral node of M. acetivorans and M. mazei and the node for M. barkeri. When the two outlier points were removed, the correlation between the rRNA length contrasts and the temperature contrasts became much stronger (R = 0.88, p < 0.001). This is consistent with the high correlations found when using the tree based on translation proteins, since this tree did not contain the Methanosarcina species.

Fig. 1. Relationship between the GC content of 16S rRNA stems and loops and the optimal growth temperature. The data are plotted for 100 archaeal species for which both complete 16S rRNA sequences and optimal growth temperatures are known. The data for the paired stem regions are shown as blue diamonds, while the data for the single-stranded loop regions are shown as purple squares. For a full list of species names and growth temperatures, see Supplementary Table 2.

Table 2. Relationship between (i) GC content and optimal growth temperature and (ii) rRNA sequence length and optimal temperature

                                        GC content    Sequence length
Original data                           0.90*         0.84*
Contrasts based on translation tree     0.85*         0.84*
Contrasts based on transcription tree   0.79*         0.44 (p = 0.043)

Note. The values shown are correlation coefficients. Data from 20 species were used for the original dataset and for the contrasts based on the transcription proteins; data from 18 species were used for the contrasts based on the translation proteins (see text for explanation). *p < 0.001.
Can We Detect Thermal Adaptation in Phylogenetically Related rRNA Sequences?
The analyses described above show that ribosomal
RNA sequences can change as a result of temperature-based natural selection, but they provide little
idea of how quickly such adaptation can occur. To
address the latter question about the rate of adaptation, we screened the 1573 bacterial species in the
rRNA database for cases where a single genus contained both mesophilic and thermophilic species. Of a
total of 444 genera, we found 19 genera each containing at least one mesophilic species and one thermophilic species (optimal temperature, ≥45°C).
Within each genus, we compared the GC contents
and sequence lengths of the rRNA stems between the
mesophilic and the thermophilic species. The results
are given in Table 3. We found that the GC content
of the rRNA stems was higher in the thermophilic
species than in the mesophilic species (p < 0.001 for a two-tailed paired t-test). The stem lengths are also longer in the thermophilic species than in the mesophilic species (p = 0.018, paired t-test), while the difference in loop length is not significant (p = 0.62, paired t-test) for the 19 genera (data not shown). Despite the statistical significance of the differences in the within-genus comparisons, the magnitude of the differences is much less than that in the broader phylogenetic comparisons (see Table 1 and Fig. 1). This reflects both the shorter evolutionary time involved and the fact that the growth temperature differences are usually not very large within a single genus. One genus in which there are large differences in optimal growth temperatures is Spirochaeta, and in this case we see correspondingly large divergences in GC content and sequence length (see Table 3).

Table 3. Average optimal growth temperature, GC content (%), and length (bases) of 16S rRNA stems of mesophilic species (Meso) and thermophilic species (Thermo) within each bacterial genus

                            Temperature (°C)            GC content (%)              Sequence length (bases)
Genus                       Meso          Thermo        Meso          Thermo        Meso          Thermo
Actinomadura (18,1)a        30.74         55            73.09         73.56         873           876
Actinopolyspora (1,1)       37            45            73.39         74.94         909           916
Amycolatopsis (7,1)         28.29         45            71.21         71.77         870           883
Bacillus (46,7)             30.03         55            64.6          68.9          907           903
Brevibacillus (6,2)         31.65         47.5          65.49         66.62         903           912
Clostridium (101,4)         34.57         57.34         63.07         63.19         872           899
Deinococcus (1,2)           30            47.5          66.59         70.21         874           878
Desulfotomaculum (6,8)      34.08         56.25         64.08         68.93         895           901
Lactobacillus (45,1)        32.97         45            61.49         62.9          913           911
Mycobacterium (43,2)        35.6          45.5          70.25         72.07         884           885
Porphyrobacter (1,2)        30            45            65.21         65.43         845           852
Pseudonocardia (7,1)        28.64         45            72.22         72.34         874           882
Rubrobacter (1,1)           37            50            69.82         76.15         888           888
Saccharomonospora (4,2)     29.42         47.43         71.85         73.06         865           867
Saccharopolyspora (5,1)     28            51.88         70.89         71.87         873           865
Spirochaeta (7,1)           32.71         65            65.8          75            891           920
Streptomyces (67,5)         28.26         46.5          71.36         72.87         885           891
Sulfobacillus (1,1)         35            45            69.5          74.29         887           884
Thermoactinomyces (1,5)     35            51.08         67.39         70.82         897           898
Mean ± SE                   32.05 ± 0.7   49.8 ± 1.3    68.3 ± 0.84   70.8 ± 0.89   884.5 ± 4.0   890.1 ± 4.20

a The first number in parentheses is the number of mesophilic species and the second number is the number of thermophilic species in the genus. For a full list of species names and growth temperatures, see Supplementary Table 1.
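The paired comparison can be retraced from the genus-level means reported in Table 3. The sketch below computes the paired t statistic for stem GC content and stem length across the 19 genera; it assumes the test was run on these genus means, which the text does not state explicitly. Converting t to a p-value requires a t distribution with 18 degrees of freedom, which the Python standard library does not provide, so the sketch stops at the statistic.

```python
import math
from statistics import mean, stdev

# Genus-level means from Table 3, in the table's row order (19 genera).
GC_MESO = [73.09, 73.39, 71.21, 64.6, 65.49, 63.07, 66.59, 64.08, 61.49,
           70.25, 65.21, 72.22, 69.82, 71.85, 70.89, 65.8, 71.36, 69.5, 67.39]
GC_THERMO = [73.56, 74.94, 71.77, 68.9, 66.62, 63.19, 70.21, 68.93, 62.9,
             72.07, 65.43, 72.34, 76.15, 73.06, 71.87, 75, 72.87, 74.29, 70.82]
LEN_MESO = [873, 909, 870, 907, 903, 872, 874, 895, 913, 884,
            845, 874, 888, 865, 873, 891, 885, 887, 897]
LEN_THERMO = [876, 916, 883, 903, 912, 899, 878, 901, 911, 885,
              852, 882, 888, 867, 865, 920, 891, 884, 898]


def paired_t(meso, thermo):
    """Return (mean difference, t statistic, degrees of freedom)."""
    diffs = [t - m for m, t in zip(meso, thermo)]
    n = len(diffs)
    d_bar = mean(diffs)
    t_stat = d_bar / (stdev(diffs) / math.sqrt(n))
    return d_bar, t_stat, n - 1


gc_result = paired_t(GC_MESO, GC_THERMO)
length_result = paired_t(LEN_MESO, LEN_THERMO)
```

The resulting t values can be compared against a t table with 18 degrees of freedom to check them against the p-values quoted in the text.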
Patterns of Nucleotide Composition in Vertebrate
18S rRNA
Having demonstrated the effect of growth temperature on the nucleotide composition and sequence
length for prokaryotic 16S rRNA, we asked if a similar trend can be seen in vertebrates when we compare
18S rRNA sequences from homeotherms and poikilotherms. One might expect the warm-blooded homeotherms (mammals and birds) to show contrasting
patterns to the cold-blooded vertebrates (fish,
amphibians, and reptiles). Indeed, overall genomic
differences in GC content have already been noted for
warm-blooded and cold-blooded vertebrates (Bernardi 2000). In agreement with these genomic averages, we found that the average GC content of 18S
rRNA sequences among the warm-blooded mammals
and birds, approximately 55.7%, is moderately higher
than that of the cold-blooded vertebrates, approximately 53.5% (see Table 4). When we partition the
18S sequences into stem and loop regions, however,
we see that, in contrast to the prokaryotic data set, the
differences are not concentrated in the stem regions of
the molecule. For example, the GC content of the
rRNA stems in warm-blooded birds is closer to that of
the cold-blooded amphibians than it is to that of the warm-blooded mammals (Table 4).
Variations in rRNA sequence length are also reported in Table 4. Again, there is no obvious
grouping of the warm-blooded and cold-blooded
classes. Instead, there is a highly significant difference
in overall sequence length within the warm-blooded
vertebrates, i.e., between mammals and birds. Specifically, the 18S rRNA length is greater in mammals
than in birds and the difference is highly significant
(p < 0.0001), whereas the average length difference
between the warm- and the cold-blooded vertebrates
is only marginally significant (p = 0.049). When we
score the stem and loop lengths separately, we see
even more variation within the groups and no overall
difference between them. For instance, the cold-blooded fish have a longer stem length than the
warm-blooded birds. It is interesting to note that the
increase in length of the vertebrate 18S rRNA compared to the prokaryotic 16S rRNA is mainly in the
loop regions of the molecules.
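One way to see that the warm-blooded versus cold-blooded difference is not concentrated in the stems is to pool the class means from Table 4. The sketch below weights each class by its number of species; that weighting is an assumption of this example and is not necessarily how the averages quoted in the text were obtained.

```python
# Class means from Table 4: (number of species, stem GC%, loop GC%).
TABLE4 = {
    "Fish":      (23, 62.1, 43.8),
    "Amphibian": (18, 62.6, 42.5),
    "Reptiles":  (5, 61.8, 44.5),
    "Aves":      (34, 63.2, 47.4),
    "Mammal":    (4, 64.1, 46.4),
}
WARM = ("Aves", "Mammal")
COLD = ("Fish", "Amphibian", "Reptiles")
STEM, LOOP = 1, 2  # column indices into the tuples above


def weighted_mean(classes, column):
    """Species-weighted mean of one Table 4 column over several classes."""
    total_n = sum(TABLE4[c][0] for c in classes)
    return sum(TABLE4[c][0] * TABLE4[c][column] for c in classes) / total_n


stem_diff = weighted_mean(WARM, STEM) - weighted_mean(COLD, STEM)
loop_diff = weighted_mean(WARM, LOOP) - weighted_mean(COLD, LOOP)
```

Under this weighting the loop difference exceeds the stem difference, which is the opposite of the pattern seen in the prokaryotic data.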
Table 4. GC content and sequence length of 18S rRNA among five classes of vertebrates

                      Sequence composition (GC%)a                     Sequence length (bases)a
Vertebrate class      Stems           Loops           Average         Stems           Loops           Total
Fish (n = 23)         62.1 ± 0.22     43.8 ± 0.28     53.4 ± 0.23     956.3 ± 2.47    869.0 ± 1.78    1825.3 ± 2.83
Amphibian (n = 18)    62.6 ± 0.21     42.5 ± 0.31     53.0 ± 0.24     901.1 ± 9.37    927.4 ± 8.04    1828.6 ± 6.18
Reptiles (n = 5)      61.8 ± 0.21     44.5 ± 0.21     53.6 ± 0.15     900.4 ± 14.5    913.0 ± 13.06   1813.4 ± 1.66
Aves (n = 34)         63.2 ± 0.08     47.4 ± 0.17     55.6 ± 0.12     945.6 ± 3.37    876.9 ± 2.96    1822.4 ± 0.48
Mammal (n = 4)        64.1 ± 0.19     46.4 ± 0.12     55.8 ± 0.18     998.6 ± 4.75    870.5 ± 3.38    1869.3 ± 2.32

a Mean ± standard error.
Fig. 2. Relationship between GC content of small subunit rRNA stems and growth temperature. The trend line for the archaeal species (see Fig. 1) is repeated here for reference. The averages for the three groups of bacteria (mesophilic, thermophilic, and hyperthermophilic) are shown as purple squares. The averages for the five vertebrate classes are shown as blue circles. Standard errors are included.
Discussion
In this study, we confirmed the previous findings of a
strong positive correlation between the GC content of
16S ribosomal RNA and the environmental growth
temperature, and we showed that this correlation is
not merely an artifact of phylogenetic relatedness. We
also found evidence of a positive relationship between
the length of rRNA stems and the temperature. Since
the same patterns were repeated within the archaeal
and bacterial lineages (see Table 1 and Fig. 2), we
can conclude that the positive relationship is not due
to phylogenetic history but reflects a repeated selective response to elevated environmental temperature.
This conclusion is supported by the phylogenetically
independent contrasts of archaeal species. The fact
that the increased GC content and increased sequence
length are both concentrated in the paired stem regions of the molecule provides further evidence for
increasing selection pressure to maintain the folded
structure of the rRNA molecule at higher temperatures.
Taken together, these results indicate that prokaryotes respond to increased environmental growth temperature by increasing the
structural stability of their 16S rRNAs. This is achieved
by increasing both the GC content and the length of
the paired regions. Both of these factors (increased
GC and increased length of paired regions) increase
the number of hydrogen bonds between the paired
strands. Thus it is reasonable to interpret these
changes as adaptations to growth at high temperature. A previous study (Galtier and Lobry 1997) has
shown that the GC contents of several structural
RNAs in prokaryotes are also positively correlated with
the optimal growth temperature. Based on the results
presented here for the 16S small subunit rRNA, it is
tempting to speculate that the stem lengths of the
large subunit 23S rRNA genes are also correlated
with the optimal growth temperature. This remains to
be confirmed, however, once a larger database of 23S
rRNA sequences becomes available.
In contrast to the data on mesophilic and thermophilic prokaryotes, the differences between
warm-blooded and cold-blooded vertebrates are not
large, and more significantly, they are not concentrated in the paired stem regions of the molecule
(see Table 4). At first glance, this seems to contradict the findings on prokaryotes, but when we
consider that the body temperatures of mammals
(37°C) and birds (39°C) are not even within the "moderately thermophilic" range of prokaryotes, the result is not at all surprising. Moreover, given that most species of "cold-blooded" snakes prefer to maintain an active body temperature of about 30°C (and up to 40°C for desert reptiles), the
temperature differences between the two groups are
not very clear-cut. In fact, there is a much greater
difference between the average body temperature of
fish and reptiles than there is between reptiles and
mammals. Thus, the results for the vertebrates, although negative, are entirely consistent with the
results for the bacteria and archaea.
Despite the lack of obvious temperature-induced
differences between the 18S rRNA sequences of
warm-blooded and cold-blooded vertebrates, these
sequences do illustrate many of the same general
features seen in the prokaryotic 16S rRNAs. For instance, the paired regions are relatively GC-rich,
while the rRNA loops of all five groups of vertebrates
have a very high content of adenine (Wang 2005).
This is reminiscent of the elevated amount of adenine
in prokaryotic 16S rRNA loops in both mesophilic
and thermophilic prokaryotes (Wang and Hickey
2002). In fact, this compositional bias also exists in
other eukaryotic species, including protists, fungi,
invertebrates, and plants (Wang 2005). There is
existing evidence that adenine contributes to the
stability of the single-stranded regions of the molecule (Gutell et al. 2000).
Finally, the fact that we see small but significant
divergences in sequence composition between related
bacteria that have contrasting optimal growth temperatures indicates that these molecular adaptations
to the elevated growth temperatures can occur over a
relatively short evolutionary time span.
Acknowledgments. This work was supported by a Research
Grant from NSERC Canada (D.A.H.) and an Ontario Graduate
Scholarship (H.C.W.). We thank Dr. N. Galtier and two reviewers
for their comments.
References
Bernardi G (2000) The compositional evolution of vertebrate genomes. Gene 259:31–43
Brochier C, Forterre P, Gribaldo S (2004) Archaeal phylogeny
based on proteins of the transcription and translation machineries: tackling the Methanopyrus kandleri paradox. Genome
Biol 5(3):R17
Dalgaard JZ, Garrett RA (1993) Archaeal hyperthermophile genes.
In: Kates M, Kushner DJ, Matheson AT (eds) The biochemistry of Archaea (Archaebacteria). Elsevier, Amsterdam, p 535
Felsenstein J (1985) Phylogeny and the comparative method. Am
Nat 125:1–15
Foster PG, Hickey DA (1999) Compositional bias may affect both
DNA-based and protein-based phylogenetic reconstructions. J
Mol Evol 48:284–290
Galagan JE, Nusbaum C, Roy A, et al. (2002) The genome of M.
acetivorans reveals extensive metabolic and physiological
diversity. Genome Res 12:532–542
Galtier N, Lobry JR (1997) Relationships between genomic GC
content, RNA secondary structures and optimal growth temperature in prokaryotes. J Mol Evol 44:632–636
Gutell RR, Cannone JJ, Shang Z, Du Y, Serra MJ (2000) A story:
unpaired adenosine bases in ribosomal RNAs. J Mol Biol
304:335–354
Harvey PH, Pagel MD (1991) The comparative method in evolutionary biology. Oxford University Press, Oxford
Hasegawa M, Hashimoto T (1993) Ribosomal RNA trees misleading? Nature 361:23
Hurst LD, Merchant AR (2001) High guanine-cytosine content is
not an adaptation to high temperature: a comparative analysis
amongst prokaryotes. Proc R Soc Lond B 268:493–497
Nakashima H, Fukuchi S, Nishikawa K (2003) Compositional
changes in RNA, DNA and proteins for bacterial adaptation to
higher and lower temperatures. J Biochem 133:507–513
Singer GAC, Hickey DA (2003) Thermophilic prokaryotes have
characteristic patterns of codon usage, amino acid composition
and nucleotide content. Gene 317:39–47
Van de Peer Y, De Rijk P, Wuyts J, Winkelmans T, De Wachter R
(2000) The European small subunit ribosomal RNA database.
Nucleic Acids Res 28:175–176
Wang H-C (2005) The effects of nucleotide bias on genome evolution. PhD thesis. University of Ottawa, Ottawa
Wang H-C, Hickey DA (2002) Evidence for strong selective constraint acting on the nucleotide composition of 16S ribosomal
RNA genes. Nucleic Acids Res 30:2501–2507
Wuyts J, Van de Peer Y, Winkelmans T, De Wachter R (2002) The
European database on small subunit ribosomal RNA. Nucleic
Acids Res 30:183–185