Figure 1. A predicted secondary structure of Sepik virus. The folding pattern was obtained with the M-fold program of Zuker and others,14 using the first 200 nucleotides of the 5′-terminal sequence and the full length of the 3′-NCR. Two arrows (5′ and 3′) point to where the 5′- and 3′-termini of the genome are placed next to each other (but not linked). The 5′-NCR sequence is read downward from the 5′ arrow, clockwise along the partially double-stranded structure; the 3′-NCR sequence is read upstream from the 3′ arrow, counterclockwise. CS, conserved sequence; 3′-CYC, cyclization sequence within CS1; 5′-CYC, cyclization sequence within the capsid gene; RSEP, tandem repeat sequence of SEPV; 3′-LSH, long stable hairpin structure near the 3′-terminal segment of the genome.

Figure 2. Bootscanning of the complete ORFs of flaviviruses. The position along the ORF is shown on the x-axis, and the percentage of permuted trees is shown on the y-axis. The query sequence (YFV) corresponds to the unmarked horizontal line at 100% across the ORF. Peaks above 50% labeled S and E indicate Sepik virus and Entebbe bat virus, respectively. Virus sequences used: Apoi virus, Bagaza virus, cell fusing agent virus, DENV-4, Entebbe bat virus, Kamiti River virus, Kedougou virus, Murray Valley encephalitis virus, Powassan virus, Sepik virus, St Louis encephalitis virus, Tamana bat virus, West Nile virus, Yokose virus, yellow fever virus, and Zika virus.

Figure 3. Bayesian phylogenetic tree of flaviviruses based on the ORF sequences. The long horizontal distance between the distantly related insect flaviviruses (CFAV and KRV) and the rest of the viruses was shortened for visual presentation and thus is not to scale. The posterior probabilities of all nodes except one are 1.00 (100%) and are not shown. The probability (< 1.00) of the exceptional node (IGUV-BSQV-KOKV) is shown on the phylogram. ALKV, Alkhurma virus; APOIV, Apoi virus; BAGV, Bagaza virus; BSQV, Bussuquara virus; CFAV, cell fusing agent virus; DENV, dengue virus; DTV, deer tick virus; ENTV, Entebbe bat virus; IGUV, Iguape virus; ILHV, Ilheus virus; JEV, Japanese encephalitis virus; KEDV, Kedougou virus; KOKV, Kokobera virus; KRV, Kamiti River virus; LGTV, Langat virus; LIV, louping ill virus; MMLV, Montana myotis leukoencephalitis virus; MODV, Modoc virus; MVEV, Murray Valley encephalitis virus; OHFV, Omsk hemorrhagic fever virus; POWV, Powassan virus; RBV, Rio Bravo virus; ROCV, Rocio virus; SEPV, Sepik virus; SLEV, St Louis encephalitis virus; TBEV, tick-borne encephalitis virus; USUV, Usutu virus; WNV, West Nile virus; YFV, yellow fever virus; YOKV, Yokose virus; ZIKV, Zika virus.

REFERENCES

1. International Committee on the Taxonomy of Viruses (ICTV), 2005. Virus Taxonomy: Classification and Nomenclature of Viruses. San Diego: Elsevier.
2. Strode GE, 1951. Yellow Fever. New York: McGraw-Hill Book Co.
3. Dennis LH, Reisberg BE, Crosbie J, Crozier D, Conrad ME, 1969. The original haemorrhagic fever: Yellow fever. Br J Haematol 17: 455–462.
4. Calisher CH, Karabatsos N, Dalrymple JM, Shope RE, Porterfield JS, Westaway EG, Brandt WE, 1989. Antigenic relationship between flaviviruses as determined by cross-neutralization tests with polyclonal antisera. J Gen Virol 70: 37–43.
5. Kuno G, Chang G-JJ, Tsuchiya KR, Karabatsos N, Cropp CB, 1998. Phylogeny of the genus Flavivirus. J Virol 72: 73–83.
6. Tajima S, Takasaki T, Matsuno S, Nakayama M, Kurane I, 2005. Genetic characterization of Yokose virus, a flavivirus isolated from bat in Japan. Virology 332: 38–44.
7. Pierre V, Drouet M-T, Deubel V, 1994. Identification of mosquito-borne flavivirus sequences using universal primers and reverse transcription/polymerase chain reaction. Res Virol 145: 179–188.
8. Chang G-JJ, 1997. Molecular biology of dengue viruses. Gubler DJ, Kuno G, eds. Dengue and Dengue Hemorrhagic Fever. Wallingford, UK: CAB International, 175–198.
9. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG, 1997. The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25: 4876–4882.
10. Hall TA, 2001. BioEdit. Raleigh, NC: Department of Microbiology, North Carolina State University.
11. Nicholas KB, Nicholas HB, Deerfield DW, 1997. GeneDoc: Analysis and visualization of genetic variation. EMBNet News 4: 14–18.
12. Rice CM, Strauss JH, 1990. Production of flavivirus polypeptides by proteolytic processing. Seminar Virol 1: 357–367.
13. Chang G-JJ, Hunt AR, Davis B, 2000. A single intramuscular injection of recombinant plasmid DNA induces protective immunity and prevents Japanese encephalitis in mice. J Virol 74: 4244–4252.
14. Zuker M, Mathews DH, Turner DH, 1999. Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. Barciszewski J, Clark BFC, eds. RNA Biochemistry and Biotechnology. Amsterdam: Kluwer Academic Publishers, 11–43.
15. Khromykh AA, Meka H, Guyatt KJ, Westaway EG, 2001. Essential role of cyclization sequences in flavivirus RNA replication. J Virol 75: 6719–6728.
16. Ray SC, 1999. Simplot. Baltimore: Johns Hopkins University.
17. Salminen MO, Carr JK, Burke DS, McCutchan FE, 1995. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res Hum Retroviruses 11: 1423–1425.
18. Felsenstein J, 1995. PHYLIP, Version 3.57c. Seattle, WA: Department of Genetics, University of Washington.
19. Kuno G, Chang G-JJ, 2005. Biological transmission of arboviruses: reexamination of and new insights into components, mechanisms, and unique traits as well as their evolutionary trends. Clin Microbiol Rev 18: 608–637.
20. Huelsenbeck JP, Ronquist F, 2001. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
21. Page RDM, 1996. TreeView: an application to display phylogenetic trees on personal computers. CABIOS 12: 357–358.
22. Hahn CS, Hahn YS, Rice CM, Lee E, Dalgarno L, Strauss EG, Strauss JH, 1987. Conserved elements in the 3′ untranslated region of flavivirus RNAs and potential cyclization sequences. J Mol Biol 198: 33–41.
23. Mutebi J-P, Rijnbrand RCA, Wang H, Ryman KD, Wang E, Fulop LD, Titball R, Barrett ADT, 2004. Genetic relationships and evolution of genotypes of yellow fever virus and other members of the yellow fever virus group within the Flavivirus genus based on the 3′ noncoding region. J Virol 78: 9652–9665.
24. Proutski V, Gould EA, Holmes EC, 1997. Secondary structure of the 3′ untranslated region of flaviviruses: Similarities and differences. Nucleic Acids Res 25: 1194–1202.
25. Rice CM, Lenches EM, Eddy SR, Shin SJ, Sheets RL, Strauss JH, 1985. Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229: 726–733.
26. Wang E, Weaver SC, Shope RE, Tesh RB, Watts DM, Barrett ADT, 1996. Genetic variation in yellow fever virus: duplication in the 3′ noncoding region of strains from Africa. Virology 225: 274–281.
27. Varelas-Wesley I, Calisher CH, 1982. Antigenic relationships of flaviviruses with undetermined arthropod-borne status. Am J Trop Med Hyg 31: 1273–1284.
28. Tabachnick WJ, 1991. Evolutionary genetics and arthropod-borne diseases. The yellow fever mosquito. Am Entomol 37: 14–24.
29. Hugot JP, Gonzalez JP, Denys C, 2001. Evolution of the Old World Arenaviridae and their rodent hosts: Generalized host-transfer or association by descent? Infect Genet Evol 1: 13–20.
30. Losos JB, Glor RE, 2003. Phylogenetic comparative methods and the geography of speciation. Trends Ecol Evol 18: 220–227.
31. Sabin AB, 1959. Survey of knowledge and problems in field of arthropod-borne virus infections. Arch Virusforsch 9: 1–10.
32. Karabatsos N, 1985. International Catalogue of Arboviruses. Third edition. San Antonio, TX: American Society of Tropical Medicine and Hygiene.

CHARACTERIZATION OF SEPIK AND ENTEBBE BAT VIRUSES CLOSELY RELATED TO YELLOW FEVER VIRUS

  • 1 Arbovirus Diseases Branch, Division of Vector-Borne Infectious Diseases, National Center for Zoonotic, Vector-Borne, and Enteric Diseases, Centers for Disease Control and Prevention, Fort Collins, Colorado

Yellow fever virus has a special place in medical history as the first animal virus isolated and as the prototype virus of the genus Flavivirus, which contains many serious human pathogens. Only recently were its closest relatives within the group identified phylogenetically. In this study, we obtained complete or near-complete genome sequences of two viruses most closely related to yellow fever virus: Sepik virus of Papua New Guinea and Entebbe bat virus of Africa. Based on full-genome characterization and a comparison of genomic traits among related viruses, we identified Sepik virus as the closest relative of yellow fever virus and analyzed the pattern of repeat and conserved sequence motifs in the 3′-noncoding region among the members of the yellow fever virus cluster. We also discuss geographic dispersal as one of the ecological traits of this lineage of flaviviruses.

INTRODUCTION

Many members of the genus Flavivirus, including yellow fever virus (YFV), dengue virus types 1 to 4 (DENV-1 to -4), Japanese encephalitis virus (JEV), Murray Valley encephalitis virus (MVEV), St Louis encephalitis virus (SLEV), tick-borne encephalitis virus (TBEV), and West Nile virus (WNV), are serious pathogens, collectively causing many millions of infections in humans annually. From the standpoint of evolution, this group of viruses is unique in that it comprises four groups that differ in the type of host association: insect viruses that replicate only in insects (mosquitoes), strictly vertebrate viruses without a capacity to replicate in arthropods, tick-borne viruses that replicate in ticks and vertebrates, and mosquito-borne viruses that replicate in mosquitoes and vertebrates.

YFV, the prototype of the genus Flavivirus,1 whose members are hereafter called flaviviruses, occupies an unparalleled position in the history of medical virology. It was the first animal virus scientifically identified as the etiologic agent of a human disease, during the celebrated study of yellow fever almost a century ago; the first virus confirmed to be transmitted by arthropod vectors; and, retrospectively, the first hemorrhagic fever-causing virus, recognized before the term viral “hemorrhagic fever” came into general use in the early 1950s.2,3 Interestingly, in the subsequent classification based on various serologic techniques, YFV was not affiliated with any other flavivirus, whereas most other members could be grouped into antigenic complexes, such as the JEV and TBEV complexes.4 For this reason, it was unexpected when the first comprehensive phylogenetic study revealed a close relationship of YFV with Sepik virus (SEPV) of Papua New Guinea, Entebbe bat virus (ENTV) of Africa, Sokuluk virus (SOKV) of Central Asia, and Yokose virus (YOKV) of Far East Asia.5 Later, Wesselsbron virus (WESSV) was also found to be affiliated with YFV. The same study also revealed that the major branch diverging closest to the root of the mosquito-borne group in the phylogenetic tree was composed of three minor branches: a branch containing ENTV, SOKV, and YOKV; a branch containing YFV and SEPV; and a more distantly related branch containing several African viruses (Bouboui virus, Uganda S virus, Banzi virus, Saboya virus, and Potiskum virus) and Austro-Asian viruses (Edge Hill and Jugra viruses). In the current International Committee on Taxonomy of Viruses (ICTV) classification, the viruses in all three minor branches, as well as WESSV, are classified as members of the YFV group.1

Because the previous phylogenetic study was based on 1-kb segments of the NS5 gene, encompassing < 10% of the whole viral genome, it was uncertain whether the close relationship of YFV with those unexpected viruses was biased by the choice of that gene segment. In this study, we focused on the first two minor branches, whose viruses had previously been found to be genetically most closely related to YFV.5 Hereafter, the viruses in these two minor branches are called members of the “YFV cluster” to distinguish them from the members of the aforementioned third minor branch. We obtained a full genome sequence of SEPV and a near-complete sequence of ENTV. Together with the full-genome sequence of YOKV reported by others,6 these data allowed us to characterize the genomic traits of those viruses in comparison with YFV. The data obtained in this study will be useful not only for discussing the evolution of the viruses in the YFV cluster but also for speculating on the history of dispersal of the mosquito-borne flaviviruses in the world in general.

MATERIALS AND METHODS

Viruses.

Suckling mouse brain specimens (one or two passages) of SEPV (strain MK7148) and ENTV (strain UgIL-30) were obtained from the World Health Organization (WHO) Reference Center in the Division of Vector-Borne Infectious Diseases of the Centers for Disease Control and Prevention (CDC).

RT-PCR and sequencing.

Viral RNA was extracted directly from infected mouse brain suspension with the QIAamp Viral RNA Mini Kit (Qiagen, Valencia, CA). cDNA was prepared by first incubating 14 μL viral RNA with 1 μL reverse primer and then rapidly cooling the mixture on ice. For sequencing the genomic region between the 5′-end of the genome and conserved sequence 2 (CS2) in the 3′-noncoding region (3′-NCR), primer (VD8)7 was used for cDNA synthesis. A reverse transcription (RT) mixture containing 5 μL RT buffer (5×), 1 μL deoxynucleotide mix (dNTPs; 10 mmol/L of each base), 6 units of reverse transcriptase (RAV-2; Amersham-Pharmacia Biotech, Piscataway, NJ), 1 μL reverse primer (100 μmol/L), and water to a total volume of 35 μL was added to the 15 μL of heat-treated viral RNA, and the mixture was incubated at 45°C for 45 minutes. Polymerase chain reaction (PCR) was performed using the Expand Long Template PCR System kit (Roche Applied Science, Indianapolis, IN). A 12-μL aliquot of cDNA was mixed with 88 μL of reaction mixture containing 10 μL PCR buffer (10×), 2 μL dNTP mix (10 mmol/L of each base), 1 μL each of forward and reverse primers (50 μmol/L), 2 μL Expand Long DNA polymerase, and 72 μL water. The thermocycling program, run on a GeneAmp PCR System 9600 thermocycler (Perkin-Elmer, Norwalk, CT), was 1 cycle of 94°C-1 minute/50°C-1 minute/68°C-5 minutes; 3 cycles of 94°C-20 seconds/50°C-1 minute/68°C-4 minutes; 10 cycles of 94°C-20 seconds/50°C-30 seconds/68°C-4 minutes, with an increment of 20 seconds per cycle; and 1 cycle of extension at 68°C for 7 minutes.

Most of the primers used were designed primarily on the basis of conserved amino acid motifs among mosquito-borne flaviviruses.8 Amplicons were purified with a Centri-Sep column (Princeton Separations, Adelphia, NJ), and aliquots of ~60–160 ng of the purified DNA templates were used for direct cycle sequencing with the PRISM DNA sequencing kit (Big Dye) for dye terminator cycle sequencing with AmpliTaq FS enzyme (ABI, Foster City, CA), as described previously,5 and the CEQ 8000 Genetic Analysis System (Beckman Coulter, Fullerton, CA).
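
The PCR volumes given above can be checked with simple dilution arithmetic (C1·V1 = C2·V2). The following is a minimal sketch rather than part of the original protocol: the helper name `final_concentration` is hypothetical, and the final concentrations it derives are implied by the stated volumes, not reported in the text.

```python
# Volumes (in microliters) of the PCR reaction-mixture components listed in the text.
pcr_mix_ul = {
    "PCR buffer (10x)": 10,
    "dNTP mix (10 mmol/L each)": 2,
    "forward primer (50 umol/L)": 1,
    "reverse primer (50 umol/L)": 1,
    "Expand Long DNA polymerase": 2,
    "water": 72,
}
cdna_ul = 12  # cDNA aliquot added to the mixture


def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same unit as `stock`)."""
    return stock * stock_volume_ul / total_volume_ul


mix_total_ul = sum(pcr_mix_ul.values())        # reaction mixture without cDNA
reaction_total_ul = mix_total_ul + cdna_ul     # complete reaction

# Derived (not stated in the text) final concentrations in the complete reaction.
primer_final_umol_per_l = final_concentration(50, 1, reaction_total_ul)
dntp_final_mmol_per_l = final_concentration(10, 2, reaction_total_ul)
buffer_final_x = final_concentration(10, 10, reaction_total_ul)

print(mix_total_ul, reaction_total_ul)
print(primer_final_umol_per_l, dntp_final_mmol_per_l, buffer_final_x)
```

The component volumes sum to the 88 μL stated for the reaction mixture, which together with the 12-μL cDNA aliquot gives a 100-μL reaction.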

The 5′ and 3′ ends were amplified using 5′-RACE and 3′-RACE kits (Life Technologies, Rockville, MD), respectively. For 3′-RACE, in addition to the Adapter Primer (5′-GGCCACGCGTCGACTAC[T17]-3′) supplied in the kit, the following reverse primers were also used: 5′-GCATGCGGCCGC[T18]AGT-3′; 5′-GCATGCGGCCGC[T18]AGA-3′; 5′-GCATGCGGCCGC[T18]AGC-3′; 5′-GCATGCGGCCGC[T18]AG-3′. Complete genomes were sequenced in both directions by primer walking, using pairs of primers selected from various regions of the flavivirus genome. The sequences of all primers used for amplification and/or sequencing in this study are available from the authors on request.

Sequence alignment.

The accession numbers of the full-length genome sequences deposited at GenBank are as follows: YFV (X03700); ENTV (DQ837641, including the ORF sequence AY632537); SEPV (DQ837642, including the ORF sequence AY632543); and YOKV (NC005039). The open reading frames (ORFs) of those viruses were aligned first by Clustal X9 followed by manual adjustment with BioEdit (version 5.0.0).10 We applied the “ReGap DNA project” function in the GeneDoc program11 to generate a properly aligned nucleotide sequence file.

Cleavage site determination.

Most cleavage sites were identified by following the proteolytic processing scheme for the flavivirus ORFs previously established by Rice and Strauss.12 The junctions of intracellular capsid and premembrane (Ci/prM), premembrane and envelope (prM/E), and envelope and non-structural protein 1 (E/NS1) proteins, which are processed by the host cellular signalase, were determined on the basis of the highest cleavage potential score using the SignalP-NN program (http://www.cbs.dtu.dk/services/).13

Secondary structure in 3′-NCR and genome cyclization.

The secondary structure in the 3′-NCR and cyclization between 3′- and 5′-terminal regions of SEPV were studied using the M-fold program,14 based on the first 200 nucleotides in the 5′-terminal sequence and the entire length of 3′-terminal sequence after the stop codon of the NS5 gene, similar to the study by Khromykh and others.15
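
Cyclization depends on complementary sequences in the 5′- and 3′-terminal regions. As a minimal illustration of what "complementary" means here, the sketch below tests whether two RNA motifs can form a perfect Watson-Crick duplex. It is not the M-fold algorithm, which performs thermodynamic folding; the motif used is an arbitrary toy sequence, not the SEPV cyclization sequence, and G-U wobble pairs are ignored.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def can_pair(five_prime_motif: str, three_prime_motif: str) -> bool:
    """True if the two motifs (both written 5'->3') form a perfect duplex."""
    return reverse_complement(five_prime_motif) == three_prime_motif.upper()


# Hypothetical toy motifs for illustration only.
toy_5cyc = "GGAUCAAC"
toy_3cyc = reverse_complement(toy_5cyc)

print(toy_3cyc)
print(can_pair(toy_5cyc, toy_3cyc))
```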

Bootscanning of the ORF.

As one of the two methods used in this study for analyzing the phylogenetic relationship among the viruses, the Bootscan program in the SimPlot package was used to examine sequence relatedness of the YFV cluster viruses across the ORF.16,17 ORFs of selected viruses were scanned against a query virus with a window size of 600 nt, and 100 phylogenetic tree replicates per window were obtained using SEQBOOT and DNAPARS of the PHYLIP package.18 This procedure was repeated across the entire ORF by sliding the window 10 nt per step; the bootstrap supports, indicating phylogenetic relatedness between the reference viruses and the query virus, were tabulated and plotted for each step. The degree of bootstrap support is expressed on the y-axis as percent permuted trees, and the position along the ORF is shown on the x-axis. The ORF sequences used in this study are the same as those used in our previous study.19
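
The sliding-window scheme described above (600-nt window, 10-nt step) can be sketched as follows. This is only an illustration of how the window coordinates are laid out, not the SimPlot implementation; the function name `bootscan_windows` is hypothetical, and the ORF length used is the ungapped SEPV ORF (3,405 codons), whereas the real analysis runs on a gapped multiple alignment whose length will differ.

```python
def bootscan_windows(alignment_length, window=600, step=10):
    """Return (start, end) coordinates (0-based, half-open) of every full window."""
    last_start = alignment_length - window
    return [(start, start + window) for start in range(0, last_start + 1, step)]


# Assumption for illustration: ungapped SEPV ORF, 3,405 codons.
orf_length_nt = 3405 * 3
windows = bootscan_windows(orf_length_nt)

replicates_per_window = 100  # bootstrap trees built per window, as in the text
total_trees = len(windows) * replicates_per_window

print(len(windows), windows[0], windows[-1], total_trees)
```

Each window would then be bootstrapped and the fraction of replicate trees grouping a reference virus with the query plotted at the window position.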

Bayesian inference.

Phylogenetic relationships were also studied by Bayesian inference. Using the MrBayes program (version 3.1),20 we ran Markov chain Monte Carlo (MCMC) sampling with Metropolis coupling to accelerate convergence and adopted the general time reversible model to infer the phylogram that maximizes the posterior probabilities of the branch relationships. As in bootscanning, the 34 ORF sequences analyzed were the same as those used in our previous study.19 The sequences of two insect flaviviruses (CFAV and KRV) were used as outgroups. The majority consensus tree, including branch lengths and posterior clade probabilities, was calculated from one million MCMC simulations. The phylogenetic tree was produced using the TreeView program (version 3.1).21

RESULTS

Full-length genome.

The full genome of SEPV is 10,793 nucleotides long, with an ORF encoding 3,405 amino acids, which is shorter than the ORFs of YOKV (3,425 amino acids; Table 1) and YFV (3,411 amino acids). Repeated attempts failed to resolve sequence inconsistencies at the 3′-terminal end of the 3′-NCR of ENTV. However, we are able to report a near-complete sequence (10,510 nucleotides) for ENTV, extending to 11 nucleotides downstream of the end of CS1 in the 3′-NCR. The ORF of ENTV is the same length as that of YFV and encodes 3,411 amino acids. The breakdown of the lengths of genes and genomic regions among the three viruses (Table 1) reveals that the longer ORF of YOKV is mainly attributable to its longer capsid gene, because the lengths of the other genes are similar. In terms of amino acid identity, SEPV is closest to YFV in all genes, with the sole exception of 2K (Table 1).
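
The SEPV figures above are internally consistent with the per-gene lengths in Table 1, as the following arithmetic sketch shows. It assumes that the stop codon is counted separately from the 3′-NCR (the Methods describe the 3′-terminal sequence as starting after the stop codon of NS5); the variable names are ours.

```python
# SEPV protein lengths in amino acids, taken from Table 1.
sepv_protein_lengths_aa = {
    "Capsid": 116, "PrM": 164, "Envelope": 490, "NS1": 353, "NS2A": 226,
    "NS2B": 130, "NS3": 623, "NS4A": 126, "2K": 23, "NS4B": 248, "NS5": 906,
}
five_ncr_nt = 116   # 5'-NCR length from Table 1
three_ncr_nt = 459  # 3'-NCR length from Table 1

orf_aa = sum(sepv_protein_lengths_aa.values())

# Assumption: one stop codon (3 nt) lies between the ORF and the 3'-NCR.
genome_nt = five_ncr_nt + 3 * (orf_aa + 1) + three_ncr_nt

print(orf_aa, genome_nt)
```

Under that assumption the gene lengths sum to the reported 3,405 amino acids and the regions add up to the reported 10,793 nucleotides.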

Cleavage sites.

The cleavage sites of the four viruses (including YFV) are listed in Table 2. The amino-termini (N termini) generated at all sites expected to be cleaved by the viral serine protease (virion capsid/intracellular capsid [Cv/Ci], NS2A/NS2B, NS2B/NS3, NS3/NS4A, NS4A/2K, and NS4B/NS5) and by the host-encoded furin-like protease (pr/M) generally follow two C-terminal basic amino acids (most typically KR, RR, or QR). In the sites cleaved by host signalase, the C-terminal and N-terminal amino acids immediately flanking M/E, E/NS1, and 2K/NS4B are generally similar. The NS1/NS2A site, which is believed to be cleaved by an unknown cellular enzyme, generally follows the sequence V-X-A (in which X is variable) defined by Rice and Strauss.12
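
The residue pair immediately upstream of each viral protease cleavage site can be tallied directly from Table 2. The sketch below does this for the YFV and SEPV rows only; the function name `p2_p1` is hypothetical (it returns the two residues preceding the "/" in the table notation), and the site strings are transcribed from Table 2.

```python
from collections import Counter

# Sites expected to be cleaved by the viral serine protease, in the order
# Cv/Ci, NS2A/NS2B, NS2B/NS3, NS3/NS4A, NS4A/2K, NS4B/NS5 (from Table 2).
viral_protease_sites = {
    "YFV": ["SSRKRR/SHDVLT", "RIFGRR/SIPVNE", "VRGARR/SGDVLW",
            "FAEGRR/GAAEVL", "EPGQQR/SIQDNQ", "METGRR/GRANGK"],
    "SEPV": ["GRRKRR/SPPASI", "TRIPQR/SWPLGE", "QKTATR/SGVLWD",
             "FAEGRR/SINGLL", "EPGTQR/STYDNQ", "AKQTRR/GRAAGV"],
}


def p2_p1(site: str) -> str:
    """Return the two residues immediately upstream of the cleavage point."""
    upstream, _downstream = site.split("/")
    return upstream[-2:]


dipeptide_counts = {
    virus: Counter(p2_p1(site) for site in sites)
    for virus, sites in viral_protease_sites.items()
}

for virus, counts in dipeptide_counts.items():
    print(virus, dict(counts))
```

In these two rows RR and QR dominate, in line with the "most typically" wording above, while the SEPV NS2B/NS3 site (TR) is an example of a departure from the typical pattern.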

Glycosylation sites and cysteine residues.

In the prM gene, the number of potential N-linked glycosylation sites is two for ENTV (residues 17–19 and 33–35), SEPV (residues 14–16 and 30–32), and YOKV (residues 17–19 and 31–33), whereas YFV has three potentially functional sites (at residues 13–15, 29–31, and 51–53), with one nonfunctional site in the C-terminal domain (at residues 145–147). All sites of the three viruses, except one in YOKV (residues 31–33), are aligned well with the corresponding sites in YFV. In the E gene, ENTV has two potentially functional sites (residues 64–66 and 467–469), as in YFV (which has functional residues at 304–306 and nonfunctional residues at 465–467), whereas SEPV and YOKV have one site (at residues 154–156 and 467–469, respectively). One of the two sites in ENTV (residues 64–66) is not aligned with the others. In the NS1 gene, both SEPV and YOKV have two potentially functional sites (at residues 130–132 and 208–210 for both viruses), like YFV. In contrast, ENTV has four potentially functional sites (residues 106–108, 130–132, 208–210, and 326–328), two of which (residues 106–108 and 326–328) are not aligned with others. The cysteine residues (6 in prM, 12 in E, and 12 in NS1) in all three viruses are well conserved and are aligned among four viruses in the YFV cluster.
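
Potential N-linked glycosylation sites are conventionally identified as the sequon N-X-S/T, where X is any residue except proline, which is why each site above spans three residues. The sketch below locates such sequons in a protein sequence; it is our illustration, not the authors' procedure, the toy peptide is hypothetical, and a sequence match says nothing about whether a site is actually functional.

```python
import re

# N-X-[S/T] with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str):
    """Return 1-based (first, last) residue positions of each N-X-S/T sequon."""
    return [(m.start() + 1, m.start() + 3) for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide: N-V-S and N-G-T match, N-P-T does not.
toy_peptide = "MKNVSAANPTLLNGTQ"
print(find_sequons(toy_peptide))
```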

5′- and 3′-NCRs.

The tandem repeat sequences and CSs in the 3′-NCR of SEPV and ENTV were studied primarily using the nomenclature originally reported by Hahn and others.22 The 15-nucleotide sequence (5′-AACCGGGATA[T/A/C]AAAC-3′) in the middle of the tandem repeats of YFV (RYFs), which is also shared by other members of the YFV group, was later designated the “core sequence” by Mutebi and others.23 The organization of the secondary structures was studied using the reference system developed by Proutski and others.24 When the 3′-NCR is followed in the 5′ to 3′ direction, SEPV has two tandem repeats (RSEP1, RSEP2) comparable with those of RYF shown earlier,25 and a closer examination revealed a third imperfect or vestigial repeat (RSEP3) preceding CS2 (Figure 1; Table 3). As shown in Figure 1, like the 3′-terminal sequence that assumes a long stable hairpin (3′-LSH), CS2 and the repeat sequences, as well as the 5′-NCR, are also involved in secondary structure. Although the M-fold program generates multiple energetically favored folding patterns, cyclization between the complementary sequences (3′-CYC and 5′-CYC) apparently restricts the number of possible folding patterns more in the 3′-NCR sequence downstream of the 3′-CYC than upstream of it (Figure 1). Unlike that of SEPV, the 3′-NCR of ENTV has only one imperfect repeat YFV sequence (ImRYF), followed by CS2 and CS1. The 3′-NCR organization of YOKV is the simplest among the four viruses in the YFV cluster, because it has only CS2 and CS1.

As reported earlier, the sequence variation among three repeats in YFV (RYFs) is very small.25 In contrast, the two tandem repeat sequences of SEPV,23 while generally preserving the core sequence of RYF,25 have substitutions, insertions, and/or deletions and require gap insertion for proper alignment (Table 3). The third sequence (RSEP3) is most different but can be considered a highly degenerate sequence of RYF. The only sequence in ENTV that has a remote similarity to RYF is a series of noncontiguous segments that collectively resembles that of RYF (Table 3). At the far end of the spectrum of the variation in the YFV cluster is YOKV, which has no similar sequence motif.
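
An exact search for the 15-nucleotide core sequence (5′-AACCGGGATA[T/A/C]AAAC-3′) can be written as a regular expression with one degenerate position. The sketch below is illustrative only: the test sequence is a hypothetical construct, not a viral 3′-NCR, and an exact-match search would by design miss degenerate repeats such as those described above for SEPV and ENTV, which need gapped alignment.

```python
import re

# Core sequence with the degenerate 11th position written as a character class.
CORE = re.compile(r"AACCGGGATA[TAC]AAAC")


def find_core(sequence: str):
    """Return (0-based start, matched text) for each exact core-sequence hit."""
    dna = sequence.upper().replace("U", "T")  # accept RNA or DNA notation
    return [(m.start(), m.group()) for m in CORE.finditer(dna)]


# Hypothetical test sequence: two valid cores and one with G at the
# degenerate position, which must not match.
toy = "GGTT" + "AACCGGGATATAAAC" + "CCA" + "AACCGGGATAGAAAC" + "TT" + "AACCGGGATACAAAC"
print(find_core(toy))
```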

Bootscanning of ORFs.

In this analysis, sequences of a group of selected viruses were scanned against the query sequence of any one of the four members of the YFV cluster across the entire length of the ORF to detect genomic relatedness. Because the data are voluminous, only the data needed to support the main conclusions are presented here. As shown in Figure 2, when YFV is selected as the query, the only virus with significant (> 80% permuted trees) genetic relatedness in multiple genes is SEPV; the moderate (50–65%) relatedness shown by ENTV in the 5′-terminal segment of the E gene is not considered very strong. When the SEPV sequence is used as the query, significant levels of genetic relatedness in multiple genes are shown only by YFV (data not shown). When ENTV is used as the query, only moderate levels of genetic relatedness (60–80%) in multiple genes are shown by YFV (data not shown).

Bayesian inference of phylogenetic relationship.

The ORF tree (Figure 3) shows the tree positions of the four YFV cluster viruses located in the lowest major branch within the mosquito-borne group. As shown in Figure 3, this group of viruses diverged into two minor branches, one containing YFV and SEPV and the other containing ENTV and YOKV.

DISCUSSION

As shown by the amino acid identities obtained in this study and as revealed previously in a 3′-NCR sequence study,23 among the available sequences, the virus closest to YFV in the YFV cluster is SEPV; ENTV and YOKV are more distantly related. This corresponds well to the division of the YFV cluster members into two minor branches shown in this study. The inclusion by the ICTV1 of the third minor branch, containing several viruses, in the broader taxonomic category of the YFV group was also confirmed by others, because members of the third minor branch (Uganda S and Banzi viruses) share the tandem repeat sequence of YFV (RYF), which is found only among YFV-related viruses.23 An intriguing pattern of RYF has been observed among the geographic strains of the virus. Past studies revealed that the numbers of such repeats found in the wild types in West Africa, Central/East Africa, and South America were 3, 2, and 1, respectively.23,26 According to one theory, the pattern of the South American genotype evolved by loss of RYF after introduction from West Africa, whereas the current West African genotype evolved by gaining RYF through duplication.23,26 The results obtained in this study also suggest a similar directional trend among the four members of the YFV cluster. If the trend was progressive loss, the three-RYF pattern was basically (but not completely) retained in SEPV, which shares the same minor branch with YFV (Figure 3). However, the repeat sequence could have been lost much earlier in the members of the other minor branch. Thus, ENTV retained only a vestigial signal, whereas YOKV lost it entirely. On the other hand, if the trend was progressive gain, the patterns in ENTV and YOKV would be considered more ancestral than those in YFV or SEPV.

Biologic traits are also useful for studying the evolutionary trend among the members of the YFV cluster. In terms of the extent of epidemic transmission, YFV is an actively proliferating virus on two continents, causing repeated outbreaks of disease. In contrast, although SEPV has been occasionally isolated from mosquitoes in the past four decades, this virus has thus far shown little disease-causing potential and has remained only in Papua New Guinea. Regarding the viruses in the other minor branch, neither ENTV nor YOKV has been isolated for nearly half a century. Furthermore, unlike mosquito-borne YFV and SEPV, ENTV, YOKV, and SOKV have been isolated only from bats; no arthropod vector has ever been identified for them. Interestingly, in contrast to the strictly vertebrate viruses that have no vector and cannot replicate in vector cells (no-vector group), these three bat-associated viruses in the YFV cluster still replicate in mosquito cells.27 This host range specificity is compatible with the grouping of those bat-associated viruses within the mosquito-borne group in the phylogram (Figure 3). Two possibilities have been raised in the past regarding the inability to identify vectors for those bat-associated viruses in the mosquito-borne group: a bias caused by incomplete field studies that failed to find the real vector, and regressive secondary loss of vector association by formerly mosquito-borne viruses.5

Historically, interest in the geographic origin of YFV, in the context of the spread of YF to the New World, has been strong in tropical medicine. An African origin of YFV has also been strongly supported by the theory that the domestication of the initially sylvatic Aedes aegypti, which led to the evolution of the urban YFV vector, originally occurred in Africa.28 More recently, multiple phylogenetic studies of flaviviruses also confirmed the position of YFV in a branch closest to the root of the mosquito-borne group. These reports, in turn, raised interest in speculating on a global dispersal of the mosquito-borne flaviviruses originating from Africa. However, this center-of-origin concept has not always been accepted. For example, in a study of the Old World arenaviruses, despite the fact that all members, with the exception of the globally distributed lymphocytic choriomeningitis virus, are found only in Africa, the origin of the group was concluded to be elsewhere.29 Similarly, in another analysis of phylogenetic data, it was concluded that the current geographic range of a species was not a reliable indicator of the historical geographic range of the ancestors of the species.30

In one of the pioneering studies to explain the unique pattern of global dispersal of related flaviviruses, Sabin31 speculated that multiple geographic foci of dispersal had developed after flaviviruses of different traits evolved from a “stem virus” in an unidentified primary focus. His speculation was based on the global dispersal patterns of Aedes-borne viscerotropic and Culex-borne neurotropic flaviviruses. Thus, the current geographic distribution of flaviviruses was explained as the result of further dispersal from the secondary foci. He also raised the possibility that a Culex-borne JEV complex virus dispersed into the New World, ultimately evolving to become a genetically distinct virus (SLEV) that still shares neurotropism and the Culex-borne mode of transmission.

In considering the geographic origin of the YFV cluster, the complexity of the available information needs to be overcome. The extant members of the YFV cluster are found in New Guinea (SEPV), Asia (YOKV, SOKV), and Africa (YFV, ENTV). In the third minor branch of the YFV group, Jugra and Edge Hill viruses are found only in Asia and Australia, respectively, whereas the others are found in Africa. To complicate the analysis further, Wesselsbron virus, which is prevalent in Africa, has also been isolated in Southeast Asia.32 Thus, it is prudent to take this information, as well as the aforementioned theories on flaviviral dispersal, into consideration when the geographic origin and subsequent dispersal of the members of the YFV cluster or of the mosquito-borne flaviviruses are debated.

Table 1

Comparison of the lengths of genes and genomic regions among the members of the YFV cluster and of amino acid similarities with YFV

| Gene or genomic region | SEPV | ENTV | YOKV* |
| --- | --- | --- | --- |
| 5′-NCR | 116 nt | 119 nt | 150 nt |
| Capsid | 116 aa (39.7) | 119 aa (28.3) | 128 aa (23.2) |
| PrM | 164 aa (55.5) | 168 aa (47.6) | 168 aa (42.8) |
| Envelope | 490 aa (54.3) | 489 aa (50.3) | 490 aa (46.6) |
| NS1 | 353 aa (64.4) | 353 aa (53.9) | 353 aa (53.7) |
| NS2A | 226 aa (40.9) | 228 aa (19.5) | 227 aa (20.8) |
| NS2B | 130 aa (50.8) | 130 aa (31.5) | 130 aa (30.8) |
| NS3 | 623 aa (71.7) | 620 aa (52.8) | 620 aa (54.0) |
| NS4A | 126 aa (51.6) | 126 aa (28.6) | 126 aa (33.3) |
| 2K | 23 aa (47.9) | 23 aa (60.9) | 23 aa (52.2) |
| NS4B | 248 aa (64.3) | 249 aa (44.5) | 249 aa (44.0) |
| NS5 | 906 aa (68.6) | 906 aa (59.3) | 906 aa (59.4) |
| 3′-NCR | 459 nt | Incomplete | 432 nt |
| Full genome length | 10793 nt | | 10857 nt |

* Data derived from the report by Tajima and others.6 nt, nucleotide; aa, amino acid; %, amino acid similarity with YFV (in parentheses).
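
As a consistency check on Table 1, the SEPV gene lengths can be summed to see whether they account for the reported genome length. The sketch below assumes that the ORF ends with a single 3-nucleotide stop codon that is not counted in any gene product, and that the 3′-NCR begins immediately after it. The function and variable names are illustrative and are not part of the original analysis.

```python
# Gene product lengths (amino acids) for SEPV, copied from Table 1.
SEPV_PROTEINS_AA = {
    "Capsid": 116, "PrM": 164, "Envelope": 490, "NS1": 353,
    "NS2A": 226, "NS2B": 130, "NS3": 623, "NS4A": 126,
    "2K": 23, "NS4B": 248, "NS5": 906,
}
SEPV_5_NCR_NT = 116  # 5'-NCR length from Table 1
SEPV_3_NCR_NT = 459  # 3'-NCR length from Table 1


def genome_length_nt(five_ncr_nt, proteins_aa, three_ncr_nt, stop_codon_nt=3):
    """Sum the non-coding regions and the ORF (3 nt per amino acid).

    stop_codon_nt is an assumption: one stop codon closing the ORF.
    """
    orf_nt = 3 * sum(proteins_aa.values()) + stop_codon_nt
    return five_ncr_nt + orf_nt + three_ncr_nt


polyprotein_aa = sum(SEPV_PROTEINS_AA.values())
total_nt = genome_length_nt(SEPV_5_NCR_NT, SEPV_PROTEINS_AA, SEPV_3_NCR_NT)
print(polyprotein_aa)  # 3405
print(total_nt)        # 10793
```

Under these assumptions the total agrees with the 10793 nt listed for SEPV in Table 1.
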
Table 2

Proposed polyprotein cleavage sites of the members of the YFV cluster

| Virus | Cv/Ci | Ci/prM | prM/M |
| --- | --- | --- | --- |
| YFV | SSRKRR/SHDVLT | LLMTGG/VTLVRK | SRRSRR/AIDLPT |
| SEPV | GRRKRR/SPPASI | PLMAYS/ASVTRQ | SRRSRR/SALITP |
| ENTV | LARKRR/SSATHL | LGAACG/IHVERF | PRRSRR/SVEITS |
| YOKV | MKRKRR/SSVSCE | VTVVGA/LQIGRM | PRRNRR/SVALTN |

| Virus | M/E | E/NS1 | NS1/NS2A |
| --- | --- | --- | --- |
| YFV | VGPAYS/AHCIGI | SLGVGA/DQGCAI | RSWVTA/GEIHAV |
| SEPV | IGPAYS/THCLGI | ATGVGA/EVGCSL | KSWVVA/SKGDVD |
| ENTV | IAPAYS/THCTSI | GTGVGA/ETGCAV | KSWVSA/ADGRRC |
| YOKV | VAPAYS/THCTNV | GTGVGA/EQACAV | KSWVSA/GEGRMC |

| Virus | NS2A/NS2B | NS2B/NS3 | NS3/NS4A |
| --- | --- | --- | --- |
| YFV | RIFGRR/SIPVNE | VRGARR/SGDVLW | FAEGRR/GAAEVL |
| SEPV | TRIPQR/SWPLGE | QKTATR/SGVLWD | FAEGRR/SINGLL |
| ENTV | LRTAKR/SMDWTD | EYTSRR/SNIMWE | YATATR/SMTTIL |
| YOKV | NNGKVR/SIDWTD | QYTKQR/SNILWE | YATTTR/SITAVI |

| Virus | NS4A/2K | 2K/NS4B | NS4B/NS5 |
| --- | --- | --- | --- |
| YFV | EPGQQR/SIQDNQ | VSVVAA/NELGML | METGRR/GRANGK |
| SEPV | EPGTQR/STYDNQ | ILMVTA/NEMGML | AKQTRR/GRAAGV |
| ENTV | DAGLQR/STQDNY | VGLVAA/NENGYL | VRGNRR/GGGGTS |
| YOKV | DTGMQR/SIQDNY | VALIVA/NENGYL | AQANRR/GGTGSG |
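
Each cleavage site in Table 2 is written as six residues on either side of the cleaved bond, which is marked by “/”. A minimal sketch of how such entries can be read is given below, using the Cv/Ci column as input. The P2, P1, and P1′ labels follow the usual protease-substrate numbering, in which residues are counted outward from the cleaved bond. Applying that numbering to this table is an assumption, and the helper name is hypothetical.

```python
# Cv/Ci cleavage sites copied from Table 2 ("/" marks the cleaved bond).
CV_CI_SITES = {
    "YFV": "SSRKRR/SHDVLT",
    "SEPV": "GRRKRR/SPPASI",
    "ENTV": "LARKRR/SSATHL",
    "YOKV": "MKRKRR/SSVSCE",
}


def flanking_residues(site):
    """Return (P2, P1, P1') for a site written as 'upstream/downstream'."""
    upstream, downstream = site.split("/")
    # P1 is the last upstream residue, P1' the first downstream residue.
    return upstream[-2], upstream[-1], downstream[0]


for virus, site in CV_CI_SITES.items():
    p2, p1, p1_prime = flanking_residues(site)
    print(f"{virus}: P2={p2} P1={p1} P1'={p1_prime}")
```

For this column, all four viruses have Arg-Arg at P2-P1 and Ser at P1′. The same helper can be applied to the other columns to see where the members of the cluster differ.
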
Table 3

Comparison of the presence of tandem repeat sequences within the 3′-NCR among the viruses of the YFV cluster

| Virus | Repeat | Sequence (5′–3′) |
| --- | --- | --- |
| YFV | RYF1* | AACCGGGATACAAACCACGGGTGGAG–AACCGG |
| | RYF2* | AAACCGGGATATAAACCACGGCTGGAG–AACCGG |
| | RYF3* | AAACCGGGATAAAAACTACGGATGGAG–AACCGG |
| SEPV | RSEP1 | AAACCGGGATAAAAACCACGGA––GAG–GACCGG |
| | RSEP2 | AAACCGGTATACAAACCAAAACAGACAGGACCGG |
| | RSEP3 | CCGGGGTAAAAAATTTTTAGGGAGCCTCCGC |
| ENTV | | AAACCGGAGCCTCCGCTGG–GAAACCAG |
| YOKV | | No repeat sequence |

* The three tandem repeats (in 5′ → 3′ direction) found by Rice and others.25 The original sequences (RYF 1-3) of the Asibi strain are 44, 44, and 50 nucleotides long, respectively, but only the 32–33-nucleotide-long middle segments are shown here for alignment purposes. The shared 15-nucleotide sequence (5′-AACCGGGATA[T/A/C]AAAC-3′) was later designated “core sequence.”23 The broken line represents a gap artificially created for alignment.
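
The core sequence quoted in the Table 3 footnote, 5′-AACCGGGATA[T/A/C]AAAC-3′, can be treated as a simple pattern and searched for in each repeat. The sketch below does this with a regular expression after removing the alignment gaps. The sequences are copied from Table 3, gaps are written as ASCII hyphens, and the function name is illustrative. It tests only for an exact match to the 15-nucleotide core, so it says nothing about weaker similarity.

```python
import re

# Core sequence from the Table 3 footnote; [TAC] is the variable position.
CORE = re.compile(r"AACCGGGATA[TAC]AAAC")

# Repeat sequences copied from Table 3 ("-" marks an alignment gap).
REPEATS = {
    "YFV RYF1": "AACCGGGATACAAACCACGGGTGGAG-AACCGG",
    "YFV RYF2": "AAACCGGGATATAAACCACGGCTGGAG-AACCGG",
    "YFV RYF3": "AAACCGGGATAAAAACTACGGATGGAG-AACCGG",
    "SEPV RSEP1": "AAACCGGGATAAAAACCACGGA--GAG-GACCGG",
    "SEPV RSEP2": "AAACCGGTATACAAACCAAAACAGACAGGACCGG",
    "SEPV RSEP3": "CCGGGGTAAAAAATTTTTAGGGAGCCTCCGC",
    "ENTV": "AAACCGGAGCCTCCGCTGG-GAAACCAG",
}


def has_core(aligned_seq):
    """True if the ungapped sequence contains an exact core match."""
    ungapped = aligned_seq.replace("-", "").replace("–", "")
    return CORE.search(ungapped) is not None


for name, seq in REPEATS.items():
    print(name, has_core(seq))
```

With the sequences as printed, the core is matched in RYF1, RYF2, RYF3, and RSEP1, but not in RSEP2, RSEP3, or the ENTV sequence.
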
Figure 1.

A predicted secondary structure of Sepik virus. The folding pattern was obtained with the M-fold program of Zuker and others,14 using the first 200 nucleotides in the 5′-NCR and the full length of the 3′-NCR. Two arrows (5′ and 3′) point to where the 5′- and 3′-termini of the genome are placed next to each other (but not linked). The 5′-NCR sequence is read downward from the 5′ arrow, clockwise along the partially double-stranded structure, and the 3′-NCR sequence is read upstream, counterclockwise from the 3′ arrow. CS, conserved sequence; 3′-CYC, cycling sequence within CS1; 5′-CYC, cycling sequence within capsid gene; RSEP, tandem repeat sequence of SEPV; 3′-LSH, long stable hairpin structure near the 3′-terminal segment of the genome.

Citation: The American Journal of Tropical Medicine and Hygiene (Am J Trop Med Hyg) 75, 6; 10.4269/ajtmh.2006.75.1165

Figure 2.

Bootscanning of the complete ORFs of flaviviruses. The entire ORF is scanned across the x-axis. The percentage of permuted trees is shown on the y-axis. The query sequence of YFV is shown as an unmarked horizontal line at 100% permutation across the ORF. Peaks above 50% permutation labeled S and E indicate Sepik virus and Entebbe bat virus, respectively. Virus sequences used: Apoi virus, Bagaza virus, cell fusing agent virus, DENV-4, Entebbe bat virus, Kamiti River virus, Kedougou virus, Murray Valley encephalitis virus, Powassan virus, Sepik virus, St Louis encephalitis virus, Tamana bat virus, West Nile virus, Yokose virus, yellow fever virus, and Zika virus.

Figure 3.

Bayesian phylogenetic tree of flaviviruses based on the ORF sequences. The long horizontal distance between the distantly related insect flaviviruses (CFAV and KRV) and the rest of the viruses was shortened for visual presentation and, thus, is not to scale. The probabilities of support of all nodes except one are 1.00 (or 100%) and are not shown. The probability < 1.00 of the exceptional node (for IGUV-BSQV-KOKV) is shown on the phylogram. ALKV, Alkhurma virus; APOIV, Apoi virus; BAGV, Bagaza virus; BSQV, Bussuquara virus; CFAV, cell fusing agent virus; DENV, dengue virus; DTV, deer tick virus; ENTV, Entebbe bat virus; IGUV, Iguape virus; ILHV, Ilheus virus; JEV, Japanese encephalitis virus; KEDV, Kedougou virus; KOKV, Kokobera virus; KRV, Kamiti River virus; LGTV, Langat virus; LIV, louping ill virus; MMLV, Montana myotis leukoencephalitis virus; MODV, Modoc virus; MVEV, Murray Valley encephalitis virus; OHFV, Omsk hemorrhagic fever virus; POWV, Powassan virus; RBV, Rio Bravo virus; ROCV, Rocio virus; SEPV, Sepik virus; SLEV, St Louis encephalitis virus; TBEV, tick-borne encephalitis virus; USUV, Usutu virus; WNV, West Nile virus; YFV, yellow fever virus; YOKV, Yokose virus; ZIKV, Zika virus.

* Address correspondence to Goro Kuno, PO Box 2087, Fort Collins, CO 80522-2087. E-mail: gok1@cdc.gov

Authors’ address: Goro Kuno and Gwong-Jen J. Chang, PO Box 2087, Fort Collins, CO 80522-2087, Telephone: 970-221-6431, Fax: 970-266-3599, E-mail: gok1@cdc.gov.

Acknowledgment: We thank S. Vander Vliet for technical assistance.

REFERENCES

1. International Committee on the Taxonomy of Viruses (ICTV), 2005. Virus Taxonomy: Classification and Nomenclature of Viruses. San Diego: Elsevier.
2. Strode GE, 1951. Yellow Fever. New York: McGraw-Hill Book Co.
3. Dennis LH, Reisberg BE, Crosbie J, Crozier D, Conrad ME, 1969. The original haemorrhagic fever: Yellow fever. Br J Haematol 17: 455–462.
4. Calisher CH, Karabatsos N, Dalrymple JM, Shope RE, Porterfield JS, Westaway EG, Brandt WE, 1989. Antigenic relationship between flaviviruses as determined by cross-neutralization tests with polyclonal antisera. J Gen Virol 70: 37–43.
5. Kuno G, Chang G-JJ, Tsuchiya KR, Karabatsos N, Cropp CB, 1998. Phylogeny of the genus Flavivirus. J Virol 72: 73–83.
6. Tajima S, Takasaki T, Matsuno S, Nakayama M, Kurane I, 2005. Genetic characterization of Yokose virus, a flavivirus isolated from bat in Japan. Virology 332: 38–44.
7. Pierre V, Drouet M-T, Deubel V, 1994. Identification of mosquito-borne flavivirus sequences using universal primers and reverse transcription/polymerase chain reaction. Res Virol 145: 179–188.
8. Chang G-JJ, 1997. Molecular biology of dengue viruses. Gubler DJ, Kuno G, eds. Dengue and Dengue Hemorrhagic Fever. Wallingford, UK: CAB International, 175–198.
9. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG, 1997. The Clustal X window interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25: 4876–4882.
10. Hall TA, 2001. BioEdit. Raleigh, NC: Department of Microbiology, North Carolina State University.
11. Nicholas KB, Nicholas HB, Deerfield DW, 1997. GeneDoc: Analysis and visualization of genetic variation. EMBONet News 4: 14–18.
12. Rice CM, Strauss JH, 1990. Production of flavivirus polypeptides by proteolytic processing. Seminar Virol 1: 357–367.
13. Chang G-JJ, Hunt AR, Davis B, 2000. A single intramuscular injection of recombinant plasmid DNA induces protective immunity and prevents Japanese encephalitis in mice. J Virol 74: 4244–4252.
14. Zuker M, Mathews DH, Turner DH, 1999. Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. Barciszewski J, Clark BFC, eds. RNA Biochemistry and Biotechnology. Amsterdam: Kluwer Academic Publishers, 11–43.
15. Khromykh AA, Meka H, Guyatt KJ, Westaway EG, 2001. Essential role of cyclization sequences in flavivirus RNA replication. J Virol 75: 6719–6728.
16. Ray SC, 1999. Simplot. Baltimore: Johns Hopkins University.
17. Salminen MO, Carr JK, Burke DS, McCutchan FE, 1995. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res Hum Retroviruses 11: 1423–1425.
18. Felsenstein J, 1995. PHYLIP, Version 3.57c. Seattle, WA: Department of Genetics, University of Washington.
19. Kuno G, Chang G-JJ, 2005. Biological transmission of arboviruses: reexamination of and new insights into components, mechanisms, and unique traits as well as their evolutionary trends. Clin Microbiol Rev 18: 608–637.
20. Huelsenbeck JP, Ronquist F, 2001. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.
21. Page RDM, 1996. Treeview: an application to display phylogenetic trees on personal computers. CABIOS 12: 357–358.
22. Hahn CS, Hahn YS, Rice CM, Lee E, Dalgarno L, Strauss EG, Strauss JH, 1987. Conserved elements in the 3′ untranslated region of flavivirus RNAs and potential cyclization sequences. J Mol Biol 198: 33–41.
23. Mutebi J-P, Rijnbrand RCA, Wang H, Ryman KD, Wang E, Fulop LD, Titball R, Barrett ADT, 2004. Genetic relationships and evolution of genotypes of yellow fever virus and other members of the yellow fever virus group within the Flavivirus genus based on the 3′ noncoding region. J Virol 78: 9652–9665.
24. Proutski V, Gould EA, Holmes EC, 1997. Secondary structure of the 3′ untranslated region of flaviviruses: Similarities and differences. Nucleic Acids Res 25: 1194–1202.
25. Rice CM, Lenches EM, Eddy SR, Shin SJ, Sheets RL, Strauss JH, 1985. Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229: 726–733.
26. Wang E, Weaver SC, Shope RE, Tesh RB, Watts DM, Barrett ADT, 1996. Genetic variation in yellow fever virus: duplication in the 3′ noncoding region of strains from Africa. Virology 225: 274–281.
27. Varelas-Wesley I, Calisher CH, 1982. Antigenic relationships of flaviviruses with undetermined arthropod-borne status. Am J Trop Med Hyg 31: 1273–1284.
28. Tabachnick WJ, 1991. Evolutionary genetics and arthropod-borne diseases. The yellow fever mosquito. Am Entomol 37: 14–24.
29. Hugot JP, Gonzalez JP, Denys C, 2001. Evolution of the Old World Arenaviridae and their rodent hosts: Generalized host-transfer or association by descent? Infect Genet Evol 1: 13–20.
30. Losos JB, Glor RE, 2003. Phylogenetic comparative methods and the geography of speciation. Trends Ecol Evol 18: 220–227.
31. Sabin AB, 1959. Survey of knowledge and problems in field of arthropod-borne virus infections. Arch Virusforsch 9: 1–10.
32. Karabatsos N, 1985. International Catalogue of Arboviruses. Third edition. San Antonio, TX: American Society of Tropical Medicine and Hygiene.
