BCCM/IHEM releases thousands of fungal DNA sequences

BCCM/IHEM is a collection of fungal strains dedicated to human and animal health. In medical and veterinary mycology, correct identification of isolates is important, notably to apply the proper treatment but also to discriminate between true pathogens and possible contaminants. Several identification methods can be used, including microscopy, MALDI-TOF mass spectrometry and DNA sequencing. The latter is considered the gold standard and relies on comparison against databases of DNA sequences of known identity. In mycology, the most commonly used DNA marker is the internal transcribed spacer (ITS), so access to reliable ITS sequences is key for correct identification. As part of its activities, BCCM/IHEM has long performed ITS sequencing on its yeast and mould strains, resulting in a dataset comprising thousands of reference sequences.

Following the FAIR guiding principles for scientific data management and stewardship*, BCCM/IHEM decided to share most of its ITS sequences by releasing them in the European Nucleotide Archive (ENA). In total, more than 6,500 sequences representing over 1,300 different species were made publicly available in the ENA database. They are also included in the GenBank (USA) and DDBJ (Japan) databases through the International Nucleotide Sequence Database Collaboration. Each sequence received an accession number and is linked to an IHEM voucher. Moreover, information on the strain was provided for each sequence, including the source of isolation and the country of origin.

The addition of these sequences aims to maximize the scientific impact of the sequencing efforts carried out by BCCM/IHEM in recent years. It also increases the fungal diversity represented in the genetic databases, which benefits the scientific community working on fungal taxonomy as well as laboratories that use DNA sequencing as an identification tool. With this release, BCCM/IHEM thus promotes the reuse of existing data in order to sustain research and disseminate scientific knowledge.

* Wilkinson et al. (2016), Scientific Data 3:160018, doi:10.1038/sdata.2016.18