Relating next-generation sequencing and bioinformatics concepts to routine microbiological testing
Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, Brazil
Elaine Cristina Pereira De Martinis   

Online publish date: 2019-05-02
Publish date: 2019-05-02
Electron J Gen Med 2019;16(3):em136
Next-Generation Sequencing (NGS) is becoming a reality in the clinical microbiology laboratory because it can speed diagnosis compared with traditional culture-based methods and, moreover, can help unravel key virulence traits of important pathogens. Nonetheless, many limitations hinder its wide application in routine testing, such as the requirement for high-performance hardware and software to support bioinformatics analysis and the need for expertise in different programming languages to perform the analyses. In this context, this review synthesizes some basic concepts involved in NGS for Whole-Genome Sequencing (WGS), based on two international efforts to standardize WGS data acquisition and processing in the clinical routine, PulseNet International and the ENGAGE project, together with other tools available for WGS analysis. It covers the available sequencing platforms and the main user-friendly pipelines dedicated to pathogen identification, including the use of appropriate databases to search for virulence factors and resistance genes, and software resources for molecular typing of isolates.