The purpose of this checklist is to aid reviewers of data-intensive microbiology studies in ensuring that final research products make datasets accessible for secondary analyses upon publication. Please see our Comment on metaRxiv for our reasoning behind the checklist.
This is meant to be a living document, and we welcome input from the community to expand, clarify, or fine-tune the suggestions listed below.
Please consider the following suggestions in your review to ensure the availability of raw datasets, clear links between sample names and raw data, and transparent descriptions of sample names and sample groups, preferably in flat-text files:
Availability of Raw Data
- Confirm that raw datasets are available at the time of review. Notify the editor immediately if data are not accessible.
Sample-Accession Number Associations
- Verify that accession numbers provided in the manuscript resolve to meaningful files. Note that some repositories issue accession numbers before making data public; anonymous access to data must be available to you as a reviewer. If unsure, attempt to download at least one raw data file to confirm accessibility.
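The accessibility check above can be partly scripted. The following is a minimal sketch, not a prescribed procedure: it assumes the study deposited sequencing runs in an INSDC archive (run accessions starting with SRR, ERR, or DRR) and that the ENA Portal API file report endpoint is used for an anonymous lookup. The function names and the accession `ERR0000001` are hypothetical placeholders.

```python
import csv
import io
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: run accessions follow the INSDC pattern (SRR/ERR/DRR plus digits).
RUN_ACCESSION = re.compile(r"[SED]RR\d{6,}")
# Assumption: the ENA Portal API file report endpoint is used for the lookup.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def looks_like_run_accession(text):
    """True if the text has the shape of an INSDC run accession."""
    return RUN_ACCESSION.fullmatch(text.strip()) is not None


def file_report_url(accession):
    """Build the anonymous file report query for one accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_bytes",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def report_lists_files(report_text):
    """True if any row of a tab-separated report has a non-empty fastq_ftp value."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    return any((row.get("fastq_ftp") or "").strip() for row in reader)


def has_public_files(accession, timeout=30):
    """Query the archive without credentials; requires network access."""
    with urlopen(file_report_url(accession), timeout=timeout) as response:
        return report_lists_files(response.read().decode("utf-8"))
```

Calling `has_public_files("ERR0000001")` with an accession from the manuscript returns `False` when the report lists no downloadable files, and may raise an error if the archive rejects the request. Either outcome is worth reporting to the editor. Downloading one listed file by hand remains the most direct confirmation.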
Sample-Metadata Associations
- Ensure that sample labels, sample groups, and metadata are not embedded in binary files (e.g., PDFs or images). These should be provided in a Supplementary Table that allows easy copying, sorting, and searching with standard tools.
Supplementary Table Requirements
Verify that the Supplementary Table is cited in the Data Availability section and that within this table:
- Each accession number is linked to its corresponding sample name (and vice versa).
- Each sample name is linked to relevant metadata (e.g., environmental variables or host characteristics).
- Group labels match those used in the manuscript and appear in a separate column to enable retrospective linkage. Ideally, this table should allow anyone to identify which sample groups were used for each hypothesis test described in the manuscript.
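If the Supplementary Table is provided as a flat-text file, the three checks above can be approximated with a short script. This is a sketch under stated assumptions: the table is tab-separated, and the column names `accession`, `sample_name`, and `group` are hypothetical defaults that would need to be adapted to the table under review. The example rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def check_sample_table(tsv_text, accession_col="accession",
                       sample_col="sample_name", group_col="group"):
    """Return a list of problems found in a tab-separated sample table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    required = (accession_col, sample_col, group_col)

    problems = [f"missing column: {col}" for col in required if col not in header]
    if problems:
        return problems

    # Any column beyond the three required ones counts as sample metadata.
    if not [col for col in header if col not in required]:
        problems.append("no metadata columns besides accession, sample name, and group")

    samples_by_accession = defaultdict(set)
    for row in reader:
        line_no = reader.line_num  # line of the input the row was read from
        values = {col: (row.get(col) or "").strip() for col in required}
        for col, value in values.items():
            if not value:
                problems.append(f"line {line_no}: empty {col}")
        if values[accession_col] and values[sample_col]:
            samples_by_accession[values[accession_col]].add(values[sample_col])

    # An accession should point to exactly one sample name.
    for accession, samples in sorted(samples_by_accession.items()):
        if len(samples) > 1:
            names = ", ".join(sorted(samples))
            problems.append(f"{accession} is linked to several samples: {names}")
    return problems


# Invented example: one missing group label and one reused accession.
example = (
    "accession\tsample_name\tgroup\tdepth_m\n"
    "ERR0000001\tS1\tsurface\t5\n"
    "ERR0000002\tS2\tdeep\t200\n"
    "ERR0000002\tS3\t\t150\n"
)
for problem in check_sample_table(example):
    print(problem)
```

An empty result does not confirm that the group labels match those used in the manuscript; that comparison still has to be made by reading the text.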
Clarity Check
- If these checks require more than five minutes of your time after reading the study, include a suggestion in your review for the authors to improve the presentation of their public data.
Reporting Back to the Editor
- Inform the editor whether you reviewed data availability in the study and whether you were satisfied with the accessibility of the public data items. Even if the study is strong in other respects, do not approve it until a revised manuscript fully addresses any data availability concerns.
On behalf of all the early-career researchers who often spend a tremendous amount of time recovering published datasets in usable forms for secondary analyses, we appreciate your diligence in maintaining high standards of data transparency and reproducibility in microbiology research.