Next generation sequencing (NGS) related services

Whole-genome sequence (WGS) data are now routinely used in bacterial taxonomy. To propose a new species, overall genome relatedness index (OGRI) values should be calculated between the type strain of the proposed species and the type strains of closely related species. Average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) values are commonly used OGRI values.

For bacterial cultures that are used in the food or feed chain and are subject to an application for authorization, the European Food Safety Authority (EFSA) requires the use of WGS data as part of the risk assessment. The WGS data should be used for the unequivocal taxonomic identification of the microorganism and for the characterization of its potential traits of concern, which may include resistance to antimicrobials of clinical relevance for humans and animals, virulence factors, and the production of known toxic metabolites (doi: 10.2903/j.efsa.2018.5206, doi: 10.2903/j.efsa.2021.6506).

 

At LMG we offer several NGS related services:

 

Genome sequencing of a pure culture

At LMG we determine draft genome sequences of your strains or of strains from our collections. DNA is extracted at our collection; library preparation and draft whole-genome sequencing are outsourced. Paired-end sequence reads (2 × 151 bp) are generated on a NextSeq 2000 platform (Illumina Inc., San Diego, CA, USA), generally yielding 200 Mb to 300 Mb per DNA sample.
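To put the stated yield into perspective, the sketch below converts it into an approximate number of read pairs and an average sequencing depth. It assumes that "Mb" means megabases of sequence and uses a hypothetical genome size of 5 Mb; neither the function names nor the genome size come from the service description.

```python
def read_pairs(yield_bp: float, read_length: int = 151) -> int:
    """Approximate number of read pairs that make up yield_bp bases (2 x read_length per pair)."""
    return round(yield_bp / (2 * read_length))


def expected_coverage(yield_bp: float, genome_size_bp: float) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return yield_bp / genome_size_bp


if __name__ == "__main__":
    genome_size = 5_000_000  # hypothetical 5 Mb bacterial genome (assumption)
    for yield_bp in (200_000_000, 300_000_000):  # 200 Mb and 300 Mb
        pairs = read_pairs(yield_bp)
        depth = expected_coverage(yield_bp, genome_size)
        print(f"{yield_bp / 1e6:.0f} Mb: about {pairs} read pairs, {depth:.0f}x average depth")
```

Under these assumptions, 200 Mb to 300 Mb corresponds to roughly 40x to 60x average depth; a smaller genome would be covered more deeply and a larger one less.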

 

Genome sequence assembly

At LMG, quality checking and trimming of the raw sequence reads and de novo genome assembly are performed in-house using the Shovill pipeline. The quality of the assembly is checked using QUAST and CheckM. The latter provides robust estimates of genome completeness and contamination using sets of genes that are ubiquitous and single-copy within a phylogenetic lineage. The program barrnap is used to extract the 16S rRNA gene sequence(s) from the assembly. EzBioCloud's 16S-based ID web service is used to find the closest relatives based on these 16S rRNA gene sequences and to check the purity and identity of the final assemblies.
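The idea behind marker-gene-based completeness and contamination estimates can be illustrated with a deliberately simplified sketch. It counts each single-copy marker independently, whereas CheckM groups markers into collocated sets and weights them, so this is not CheckM's actual algorithm. The marker names and the function `marker_stats` are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable


def marker_stats(expected_markers: set[str], observed_hits: Iterable[str]) -> tuple[float, float]:
    """Return (completeness %, contamination %) from single-copy marker gene hits.

    Completeness: share of expected markers found at least once.
    Contamination: extra copies of expected markers, relative to the marker count.
    Simplified illustration only; markers are treated independently.
    """
    total = len(expected_markers)
    if total == 0:
        raise ValueError("expected_markers must not be empty")
    # Count only hits that belong to the expected marker set.
    counts = Counter(hit for hit in observed_hits if hit in expected_markers)
    found = sum(1 for marker in expected_markers if counts[marker] >= 1)
    extra_copies = sum(counts[marker] - 1 for marker in expected_markers if counts[marker] > 1)
    return 100.0 * found / total, 100.0 * extra_copies / total


if __name__ == "__main__":
    expected = {"marker1", "marker2", "marker3", "marker4"}  # hypothetical marker set
    hits = ["marker1", "marker2", "marker2", "marker3"]      # marker2 twice, marker4 missing
    completeness, contamination = marker_stats(expected, hits)
    print(f"completeness {completeness:.1f} %, contamination {contamination:.1f} %")
```

A missing marker lowers completeness, while a marker present in more than one copy suggests that sequence from a second organism may be present in the assembly.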

 

Identification based upon assembled genome sequence (WGS) data

At LMG we use WGS data of the strains of interest for the unequivocal taxonomic identification of the microorganisms. In general, we select publicly available genome sequences of type strains of closely related species for pairwise comparison through the calculation of ANI and/or dDDH values. The selection of these public genome sequences is based on (i) 16S rRNA gene sequence similarities between the strains of interest and the type strains of the related species, (ii) PaSiT4 values obtained by comparing the WGS data of the strains of interest to an in-house reference database containing karlin4 signatures generated from public genome sequences, and (iii) our taxonomic expertise.
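As an illustration of how pairwise OGRI values can lead to a species-level call, the sketch below compares ANI and dDDH values against commonly cited species boundaries (95 to 96 % ANI and 70 % dDDH, as discussed by Chun et al. 2018). The cutoffs, the `Comparison` class and the example values are assumptions made for this sketch; they are not a description of the decision rules applied at LMG.

```python
from __future__ import annotations

from dataclasses import dataclass

# Commonly cited species boundaries (assumed for illustration only).
ANI_SPECIES_CUTOFF = 95.0   # percent
DDDH_SPECIES_CUTOFF = 70.0  # percent


@dataclass
class Comparison:
    """OGRI values between a query genome and one type strain genome."""
    type_strain: str
    ani: float | None = None
    dddh: float | None = None


def same_species(c: Comparison) -> bool:
    """True if at least one OGRI value is available and all available values reach their cutoff."""
    checks = []
    if c.ani is not None:
        checks.append(c.ani >= ANI_SPECIES_CUTOFF)
    if c.dddh is not None:
        checks.append(c.dddh >= DDDH_SPECIES_CUTOFF)
    return bool(checks) and all(checks)


def identify(comparisons: list[Comparison]) -> str | None:
    """Return the best-matching type strain above the cutoffs, or None if there is none."""
    matches = [c for c in comparisons if same_species(c)]
    if not matches:
        return None
    # Rank by ANI first, then dDDH; missing values sort last.
    best = max(
        matches,
        key=lambda c: (
            c.ani if c.ani is not None else -1.0,
            c.dddh if c.dddh is not None else -1.0,
        ),
    )
    return best.type_strain


if __name__ == "__main__":
    comparisons = [  # made-up example values
        Comparison("type strain A", ani=98.2, dddh=84.0),
        Comparison("type strain B", ani=88.5, dddh=35.1),
    ]
    print(identify(comparisons))
```

When no type strain reaches the cutoffs, the function returns `None`, which in practice would prompt a closer look at whether the strain may represent a species without a sequenced type strain or a novel species.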

 

For a WGS-based identification service, the service report provided to our customers includes a materials and methods part, a results part (i.e. genome features) and a conclusions part (WGS-based species-level identification results). Additionally, an Excel file is provided with the OGRI values on which the identifications are based. We also provide a link to a zip folder with your WGS data. This folder contains the demultiplexed raw reads (file type: fastq.gz), the final assembly of each genome (file type: fa) and information on the quality of each assembly. We normally provide one service report per order.

 

Literature:

  • Chun J, Oren A, Ventosa A, Christensen H, Arahal DR, et al. Proposed minimal standards for the use of genome data for the taxonomy of prokaryotes. Int J Syst Evol Microbiol 2018;68:461–466.
  • European Food Safety Authority (EFSA). EFSA statement on the requirements for whole genome sequence analysis of microorganisms intentionally used in the food chain. EFSA J 2021;19(7):6506. DOI: 10.2903/j.efsa.2021.6506
  • Goussarov G, Cleenwerck I, Mysara M, Leys N, Monsieurs P, et al. PaSiT: a novel approach based on short-oligonucleotide frequencies for efficient bacterial identification and typing. Bioinformatics 2020;36:2337–2344.
  • Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 2013;29:1072–1075.
  • Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 2015;25:1043–1055.
  • Rychen G, Aquilina G, Azimonti G, Bampidis V, Bastos M de L, et al. Guidance on the characterisation of microorganisms used as feed additives or as production organisms. EFSA J 2018;16:1–24. DOI: 10.2903/j.efsa.2018.5206
  • Yoon SH, Ha SM, Kwon S, Lim J, Kim Y, et al. Introducing EzBioCloud: A taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies. Int J Syst Evol Microbiol 2017;67:1613–1617.