This program runs on DNA sequencers and copies selected files from sequencing runs to a "landing zone" on the same machine.
The landingzones program then synchronizes the landing zone with a directory on a remote server.
This program is run regularly by a cron job.
For deployment at SSI, see the (private) repo rit-deploy-sequencer-sync.
For other users, e.g. non-SSI users:
- Install Rust via rustup: https://rustup.rs/
- Compile with `cargo build --release` from within this repo and find the binary in `target/release`.
When `sequencer-sync run` is invoked, it:
- Loads and validates the config file
- Acquires a file lock to prevent concurrent runs
- Loads the transfer log (JSONL), which tracks previously transferred directories
- Scans the source directory for subdirectories not yet in the transfer log
- For each new directory, matches it against the configured categories by regex
- Skips directories where any configured completion file glob fails to match (i.e. the sequencing run is still in progress), unless `--transfer-incomplete` is set
- Transfers matching directories to the category's landing zone via `rsync -a`, respecting exclude patterns found in the config
- Records success/failure in the transfer log; on success, writes a `transfer_successful.txt` marker in the transferred directory
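
The transfer step above can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name `build_rsync_command`, its parameters, and the use of rsync's `--exclude=PATTERN` option for the configured exclude patterns are assumptions. The command is only built here, not executed.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) an rsync invocation for one run directory.
/// `excludes` stands in for the exclude patterns from the config file
/// (hypothetical parameter; the real config layout is not shown here).
fn build_rsync_command(source: &Path, landing_zone: &Path, excludes: &[&str]) -> Command {
    let mut cmd = Command::new("rsync");
    // Archive mode: recursive, preserves permissions, times, symlinks, etc.
    cmd.arg("-a");
    for pattern in excludes {
        cmd.arg(format!("--exclude={pattern}"));
    }
    // No trailing slash on the source, so rsync copies the directory itself
    // into the landing zone rather than only its contents.
    cmd.arg(source).arg(landing_zone);
    cmd
}
```

Calling `.status()` on the returned `Command` would run rsync and give an exit status that could be recorded as success or failure in the transfer log.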

Notes:
- Previously failed transfers can be retried with `--retry-failed`
- If "redo" is manually set to true in the JSONL transfer log, previously transferred directories are re-transferred
- When the same directory is present multiple times in the transfer log, later entries override earlier ones.
- The file lock is not necessarily held if the lock file exists. Instead, the lock is managed with `flock()` system calls. Use the `flock` tool to check if the lock is held.
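
The transfer-log rules described here (later entries override earlier ones, failed transfers are skipped unless `--retry-failed` is set, "redo" forces a re-transfer) can be sketched roughly as below. The struct `LogEntry`, its field names, and both function names are hypothetical. The sketch also assumes that "redo" is read from the latest entry for a directory, which the text does not spell out.

```rust
use std::collections::HashMap;

/// One parsed line of the JSONL transfer log (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct LogEntry {
    directory: String,
    success: bool,
    redo: bool,
}

/// Collapse the log so that the last entry for each directory wins.
fn latest_entries(log: &[LogEntry]) -> HashMap<&str, &LogEntry> {
    let mut latest = HashMap::new();
    for entry in log {
        // Inserting in file order means later lines overwrite earlier ones.
        latest.insert(entry.directory.as_str(), entry);
    }
    latest
}

/// Decide whether a directory should be transferred in this run.
fn should_transfer(
    latest: &HashMap<&str, &LogEntry>,
    directory: &str,
    retry_failed: bool,
) -> bool {
    match latest.get(directory) {
        None => true,                          // never seen before
        Some(e) if e.redo => true,             // manually flagged for redo
        Some(e) if !e.success => retry_failed, // failed: only with the flag
        Some(_) => false,                      // already transferred
    }
}
```

Because only the latest entry per directory is consulted, a directory that failed once and later succeeded is not retried even with `--retry-failed`, which matches the behavior described for that flag.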

Commands:
- `sequencer-sync setup`: Validate config file, check directories have correct permissions, and print cron tab
  - `--config-path` (required): path to config file to load, see our deploy repo
  - `--skip-ssh-check`: By default, setup will check that you have passwordless SSH access with the username/host/port provided by the config file. If this option is set, skip that check.
- `sequencer-sync run`: Synchronize files to the landing zone
  - `--config-path` (required): path to config file to load, see our deploy repo
  - `--retry-failed`: A failed transfer is logged as unsuccessful in `log/transferred-direcotries.jsonl` and is skipped in future runs. If this flag is set, failed directories are not skipped (unless they also appear as succeeded later in the log).
  - `--transfer-incomplete`: Data from a sequencing run is only considered complete if every glob in `completion_file_globs` matches at least one file. Without this flag set, incomplete runs are skipped.
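
The completeness rule behind `--transfer-incomplete` (every glob in `completion_file_globs` must match at least one file) can be illustrated with the sketch below. It is not the program's implementation: `wildcard_match` is a deliberately minimal stand-in for real glob matching that supports only `*`, and `run_is_complete` and the file names used with it are hypothetical.

```rust
/// Match `name` against a pattern in which `*` matches any run of
/// characters (including none). No `?`, `[...]`, or `**` support, and
/// `*` also matches `/`, unlike most real glob implementations.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0, 0);
    // Position of the most recent `*` and the name index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            backtrack = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((star_pi, star_ni)) = backtrack {
            // Let the last `*` absorb one more character and try again.
            backtrack = Some((star_pi, star_ni + 1));
            pi = star_pi + 1;
            ni = star_ni + 1;
        } else {
            return false;
        }
    }
    // Any pattern characters left over must all be `*`.
    p[pi..].iter().all(|&c| c == '*')
}

/// A run is complete when every completion glob matches at least one file.
/// With no globs configured this returns true (nothing is required).
fn run_is_complete(completion_file_globs: &[&str], files: &[&str]) -> bool {
    completion_file_globs
        .iter()
        .all(|glob| files.iter().any(|file| wildcard_match(glob, file)))
}
```

For example, with the globs `RTAComplete.txt` and `CopyComplete*`, a directory containing only `RunInfo.xml` and `RTAComplete.txt` would be treated as incomplete and skipped unless `--transfer-incomplete` is set.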