This R package provides a programmatic interface to the Bacterial Diversity Metadatabase of the DSMZ (German Collection of Microorganisms and Cell Cultures). BacDiveR helps you improve your research on bacteria and archaea by providing access to “structured information on […] their taxonomy, morphology, physiology, cultivation, geographic origin, application, interaction” and more (Söhngen et al. 2016). Specifically, you can:

  • download the BacDive data you need for offline investigation, and

  • document your searches and downloads in .R scripts, .Rmd files, etc.

Thus, BacDiveR can be the basis for a reproducible data analysis pipeline. See TIBHannover.GitHub.io/BacDiveR for more details, /news there for the changelog, and GitHub.com/TIBHannover/BacDiveR for the latest source code.

It was also built to serve as a demonstration object during TIB’s “FAIR Data & Software” workshop.

Installation

  1. Because the BacDive Web Service requires registration, please register first and wait for DSMZ staff to grant you access.

  2. Once you have your login credentials, install the latest BacDiveR release from GitHub with: if(!require('devtools')) install.packages('devtools'); devtools::install_github('TIBHannover/BacDiveR').

  3. After installing, follow the instructions on the console to save your login credentials locally, then restart R(Studio). Alternatively, run usethis::edit_r_environ() and ensure the file contains the following:

BacDive_email=your.email@provider.org
BacDive_password=YOUR_20_char_password

In the examples and vignettes, data retrieval will only work if your login credentials are correct (no typos) and were saved correctly. Console output like "{\"detail\": \"Invalid username/password\"}" or Error: $ operator is invalid for atomic vectors indicates that either the login credentials or the .Renviron file are incorrect.
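Whether the two variables are visible to the current R session can be checked without contacting the web service. The following is a minimal sketch using only base R; the helper name missing_bacdive_credentials() is hypothetical and not part of BacDiveR, and it only tests for presence, not for correctness of the values.

```r
# Hypothetical helper (not part of BacDiveR): returns the names of the
# credential variables that are missing or empty in this R session.
missing_bacdive_credentials <- function(vars = c("BacDive_email", "BacDive_password")) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_bacdive_credentials()
if (length(missing) > 0) {
  message("Not set in this session: ", paste(missing, collapse = ", "))
} else {
  message("Both credential variables are set (their correctness is not checked).")
}
```

If a variable is reported as not set although it is in the .Renviron file, restarting R(Studio) is usually needed, because that file is only read at startup.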

How to use

There are two main functions: retrieve_data() and retrieve_search_results(). Please read their documentation, and find real-life examples in the vignettes “BacDive-ing in” and “Pre-Configuring Advanced Searches”.
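As a rough orientation, a first call might look like the sketch below. It assumes that retrieve_data() accepts searchTerm and searchType arguments and that "taxon" is a valid search type; neither is stated in this README, so please confirm both with ?retrieve_data. The wrapper fetch_taxon() and the example taxon are purely illustrative.

```r
# Sketch only: the argument names and the "taxon" search type are assumptions;
# check ?retrieve_data before relying on them.
fetch_taxon <- function(taxon) {
  BacDiveR::retrieve_data(searchTerm = taxon, searchType = "taxon")
}

# Only contact the web service if the package is installed and both
# credential variables are set in this session.
credentials_set <- all(nzchar(Sys.getenv(c("BacDive_email", "BacDive_password"))))
if (requireNamespace("BacDiveR", quietly = TRUE) && credentials_set) {
  datasets <- fetch_taxon("Bacillus halotolerans")
  str(datasets, max.level = 1)  # inspect the top-level structure only
}
```

Keeping such calls in a script or .Rmd file is what documents the search alongside the analysis.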

How to cite

It is best to run citation('BacDiveR') in the R console and use its output, because that ensures you are citing exactly the installed version.
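For a script-based workflow, the citation and the installed version can be captured together. This is a small sketch using functions from R's utils package; the helper name cite_installed() is hypothetical.

```r
# Hypothetical helper: returns the installed version of a package together
# with its citation converted to BibTeX.
cite_installed <- function(pkg = "BacDiveR") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed.")
  }
  list(
    version = as.character(utils::packageVersion(pkg)),
    bibtex  = utils::toBibtex(utils::citation(pkg))
  )
}

# Only runs if BacDiveR is installed.
if (requireNamespace("BacDiveR", quietly = TRUE)) {
  ref <- cite_installed("BacDiveR")
  message("Installed version: ", ref$version)
  print(ref$bibtex)
}
```

Comparing the reported version with the one in the printed entry helps to confirm that the citation matches the version the analysis was run with.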

You can also use the Cite as or Export options on the appropriate Zenodo record. If you want to import this GitHub repo’s metadata into a reference manager directly, I recommend Zotero and its GitHub translator. Please double-check that the citation refers to the same version number that you ran your analysis with.

When using BibTeX, you may want to try changing the item type to @Software ;-) Support for that is being worked on.

Don’t forget to also cite BacDive itself whenever you use its data, regardless of access method.

How to contribute: See the CONTRIBUTING.md file.

Known issues: See bugs and ADRs.

Similar tools

These seem to scrape all data, instead of retrieving specific datasets.

References

  • Söhngen, Bunk, Podstawka, Gleim, Overmann. 2014. “BacDive — the Bacterial Diversity Metadatabase.” Nucleic Acids Research 42 (D1): D592–D599. doi:10.1093/nar/gkt1058.

  • Söhngen, Podstawka, Bunk, Gleim, Vetcininova, Reimer, Ebeling, Pendarovski, Overmann. 2016. “BacDive – the Bacterial Diversity Metadatabase in 2016.” Nucleic Acids Research 44 (D1): D581–D585. doi:10.1093/nar/gkv983.

  • Reimer, Vetcininova, Carbasse, Söhngen, Gleim, Ebeling, Overmann. 2018. “BacDive in 2019: Bacterial Phenotypic Data for High-Throughput Biodiversity Analysis.” Nucleic Acids Research. doi:10.1093/nar/gky879.