From b76696be5d1c02231ea370ed5ccdfab0fc25ac16 Mon Sep 17 00:00:00 2001
From: SexyERIC0723
Date: Wed, 8 Apr 2026 17:24:47 +0100
Subject: [PATCH 1/2] docs: add cloud URI access documentation

Document the fsspec-based cloud/remote file access feature that was
added in PR #523 but never documented. Covers:
- Supported protocols (S3, GCS, Azure, HTTPS)
- Installation of cloud extras
- Usage examples for rdrecord and rdann with remote paths
- Authentication guidance for credentialed cloud providers

Closes #547
---
 docs/io.rst | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/docs/io.rst b/docs/io.rst
index 80831a4b..76dd6cd9 100644
--- a/docs/io.rst
+++ b/docs/io.rst
@@ -30,6 +30,49 @@ WFDB Annotations
     :members: wrann
 
 
+Cloud and Remote Access
+-----------------------
+
+WFDB-Python supports reading records and annotations directly from cloud
+storage and remote URLs via the ``fsspec`` library. Instead of downloading
+entire databases, you can access individual files on demand.
+
+**Supported protocols** include ``s3://`` (Amazon S3), ``gs://`` (Google
+Cloud Storage), ``az://`` (Azure Blob Storage), ``https://``, and any
+other protocol supported by ``fsspec``.
+
+**Installation:**
+
+.. code-block:: bash
+
+    pip install wfdb[cloud]
+    # or: pip install fsspec s3fs (for S3 specifically)
+
+**Usage examples:**
+
+.. code-block:: python
+
+    import wfdb
+
+    # Read a record from an HTTPS URL
+    record = wfdb.rdrecord("100", pn_dir="https://physionet.org/files/mitdb/1.0.0/")
+
+    # Read from Amazon S3
+    record = wfdb.rdrecord("s3://my-bucket/wfdb-data/100")
+
+    # Read annotations from a remote path
+    ann = wfdb.rdann("100", "atr", pn_dir="https://physionet.org/files/mitdb/1.0.0/")
+
+**Authentication:** For cloud providers requiring credentials (S3, GCS,
+Azure), configure authentication through the standard provider-specific
+mechanism (e.g., ``~/.aws/credentials`` for S3, ``GOOGLE_APPLICATION_CREDENTIALS``
+for GCS). The ``fsspec`` library handles credential discovery automatically.
+
+For PhysioNet databases that require credentialed access, you can pass
+credentials via ``fsspec`` storage options or configure them in your
+environment before calling ``wfdb`` functions.
+
+
 Downloading
 -----------
 

From 1db260d1250dbd36f34754a6232d06a3c77c558d Mon Sep 17 00:00:00 2001
From: SexyERIC0723
Date: Wed, 8 Apr 2026 17:34:42 +0100
Subject: [PATCH 2/2] fix: correct cloud URI docs per Codex review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Narrow supported protocols to s3://, gs://, az://, azureml://
  (matching actual code in _coreio.py)
- Remove non-existent wfdb[cloud] install extra — fsspec is already
  a core dependency; document per-provider backend packages instead
- Fix examples: cloud URIs go in record_name, not pn_dir
- Remove storage_options reference (not in public API)
- Add note clarifying pn_dir vs record_name semantics
---
 docs/io.rst | 55 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/docs/io.rst b/docs/io.rst
index 76dd6cd9..a67fbdf7 100644
--- a/docs/io.rst
+++ b/docs/io.rst
@@ -30,23 +30,25 @@ WFDB Annotations
     :members: wrann
 
 
-Cloud and Remote Access
------------------------
+Cloud Storage Access
+--------------------
 
 WFDB-Python supports reading records and annotations directly from cloud
-storage and remote URLs via the ``fsspec`` library. Instead of downloading
-entire databases, you can access individual files on demand.
+storage via the ``fsspec`` library. Pass a cloud URI as the
+``record_name`` argument instead of a local path.
 
-**Supported protocols** include ``s3://`` (Amazon S3), ``gs://`` (Google
-Cloud Storage), ``az://`` (Azure Blob Storage), ``https://``, and any
-other protocol supported by ``fsspec``.
+**Supported protocols:** ``s3://`` (Amazon S3), ``gs://`` (Google Cloud
+Storage), ``az://`` (Azure Blob Storage), and ``azureml://`` (Azure ML).
 
-**Installation:**
+**Prerequisites:** Install the ``fsspec`` backend for your cloud provider:
 
 .. code-block:: bash
 
-    pip install wfdb[cloud]
-    # or: pip install fsspec s3fs (for S3 specifically)
+    pip install s3fs  # Amazon S3
+    pip install gcsfs  # Google Cloud Storage
+    pip install adlfs  # Azure Blob Storage
+
+``fsspec`` itself is already included as a core dependency of ``wfdb``.
 
 **Usage examples:**
 
@@ -54,23 +56,30 @@ other protocol supported by ``fsspec``.
 
     import wfdb
 
-    # Read a record from an HTTPS URL
-    record = wfdb.rdrecord("100", pn_dir="https://physionet.org/files/mitdb/1.0.0/")
-
-    # Read from Amazon S3
+    # Read a record from Amazon S3
     record = wfdb.rdrecord("s3://my-bucket/wfdb-data/100")
 
-    # Read annotations from a remote path
-    ann = wfdb.rdann("100", "atr", pn_dir="https://physionet.org/files/mitdb/1.0.0/")
+    # Read from Google Cloud Storage
+    record = wfdb.rdrecord("gs://my-bucket/wfdb-data/100")
+
+    # Read annotations from S3
+    ann = wfdb.rdann("s3://my-bucket/wfdb-data/100", "atr")
+
+    # For PhysioNet databases, use pn_dir with the database name:
+    record = wfdb.rdrecord("100", pn_dir="mitdb")
+    ann = wfdb.rdann("100", "atr", pn_dir="mitdb")
+
+**Authentication:** Configure credentials through the standard
+provider-specific mechanism (e.g., ``~/.aws/credentials`` for S3,
+``GOOGLE_APPLICATION_CREDENTIALS`` for GCS). The ``fsspec`` library
+handles credential discovery automatically.
 
-**Authentication:** For cloud providers requiring credentials (S3, GCS,
-Azure), configure authentication through the standard provider-specific
-mechanism (e.g., ``~/.aws/credentials`` for S3, ``GOOGLE_APPLICATION_CREDENTIALS``
-for GCS). The ``fsspec`` library handles credential discovery automatically.
+.. note::
 
-For PhysioNet databases that require credentialed access, you can pass
-credentials via ``fsspec`` storage options or configure them in your
-environment before calling ``wfdb`` functions.
+    Cloud URIs must be passed as ``record_name``, not ``pn_dir``.
+    The ``pn_dir`` parameter is reserved for PhysioNet database names
+    (e.g., ``"mitdb"`` or ``"mimic4wdb/0.1.0"``), which are resolved
+    against the configured PhysioNet index URL.
 
 
 Downloading