Volume 95, Issue 3
  • ISSN: 0002-9637
  • E-ISSN: 1476-1645



Yersinia pestis is among the most dangerous human pathogens, and its systematic study is important to bacterial pathogenomics. To fully interpret the biological functions, physiological characteristics, and pathogenesis of Y. pestis, a comprehensive annotation of its entire genome is necessary. The emergence of omics-based research has brought new opportunities to better annotate the genome of this pathogen. Here, the complete genome of Y. pestis strain 91001 was reannotated using genomics and proteogenomics data. One hundred and thirty-seven unreliable coding sequences were removed, 41 homologous genes had their translational initiation sites relocated, and the functions of seven pseudogenes and 392 hypothetical genes were revised. Annotations of noncoding RNAs, repeat sequences, and transposable elements were also incorporated. The reannotated results are freely available at http://tody.bmi.ac.cn.







  • Received: 16 Mar 2016
  • Accepted: 17 May 2016
