Volume 95, Issue 3
  • ISSN: 0002-9637
  • E-ISSN: 1476-1645



Yersinia pestis is among the most dangerous human pathogens, and its systematic study is important to bacterial pathogenomics. To fully interpret the biological functions, physiological characteristics, and pathogenesis of Y. pestis, a comprehensive annotation of its entire genome is necessary. The emergence of omics-based research has brought new opportunities to better annotate the genome of this pathogen. Here, the complete genome of Y. pestis strain 91001 was reannotated using genomics and proteogenomics data. One hundred and thirty-seven unreliable coding sequences were removed, 41 homologous genes had their translational initiation sites relocated, and the functions of seven pseudogenes and 392 hypothetical genes were revised. Moreover, annotations of noncoding RNAs, repeat sequences, and transposable elements were also incorporated. The reannotated results are freely available at http://tody.bmi.ac.cn.







  • Received : 16 Mar 2016
  • Accepted : 17 May 2016
  • Published online : 07 Sep 2016
