Refactor download_url test into TestDownloadUrl class#8829
Refactor download_url test into TestDownloadUrl class#8829ericspod merged 6 commits intoProject-MONAI:devfrom
Conversation
for more information, see https://pre-commit.ci
📝 WalkthroughWalkthroughUpdates test-data URLs in tests/testing_data/data_config.json to point to the HuggingFace testing_data dataset. Modifies tests/test_utils.py: expands skip logic to treat ValueError messages containing "hash check" as skip-worthy, updates DOWNLOAD_FAIL_MSGS to use "hash check", adds SAMPLE_TIFF constants and a TestDownloadUrl unit test that verifies download success with a correct hash and that an incorrect hash raises RuntimeError. Changes monai/apps/utils.py: download_url now raises ValueError (with an expanded message) when a post-download hash check fails. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/testing_data/data_config.json (1)
88-92:⚠️ Potential issue | 🟡 MinorTwo URLs were missed in the HF migration.
images.nrrd_example(line 89) andconfigs.test_meta_file(line 165) still point atgithub.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/.... Per the PR objective (issue#8510), all test data URLs should move to the HF dataset. Please migrate these too (or note why they're exempt).#!/bin/bash # Confirm remaining GitHub Releases URLs in the config. rg -n 'MONAI-extra-test-data' tests/testing_data/data_config.jsonAlso applies to: 163-168
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/testing_data/data_config.json` around lines 88 - 92, The two remaining test-data entries images.nrrd_example and configs.test_meta_file still point to the old GitHub release; update their "url" fields to the corresponding HuggingFace dataset URLs used by the rest of the config (replace the github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/... links with the HF dataset path), and if the file content or location changed on HF, update the "hash_val" (keeping "hash_type": "sha256") to the new checksum; after changing images.nrrd_example and configs.test_meta_file in data_config.json, run the repo search used in the PR verification (e.g., rg -n 'MONAI-extra-test-data' tests/testing_data/data_config.json) to ensure no remaining GitHub-release links remain.
🧹 Nitpick comments (1)
tests/test_utils.py (1)
195-215: Consider sourcing URL/hash fromtesting_data_configinstead of hard-coding.
wsi_generic_tiffindata_config.jsonalready holds the same URL/hash_type/hash_val. Usingtesting_data_config("images", "wsi_generic_tiff", ...)would avoid duplication and keep the test in sync if the config is ever updated.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_utils.py` around lines 195 - 215, The test_download_url test currently hard-codes SAMPLE_TIFF, SAMPLE_TIFF_HASH and SAMPLE_TIFF_HASH_TYPE; replace those with values loaded from the shared test config by calling testing_data_config("images", "wsi_generic_tiff", "url"), testing_data_config("images", "wsi_generic_tiff", "hash_val") and testing_data_config("images", "wsi_generic_tiff", "hash_type") (or destructure a single call if helper returns all fields) and pass those into download_url instead of the constants; update imports in tests/test_utils.py to include testing_data_config and keep the existing download_url usage and assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_utils.py`:
- Around line 209-215: Replace the second transient download with a
deterministic hash-mismatch check: first download SAMPLE_TIFF once (guarded by
skip_if_downloading_fails()) into a temp filepath (e.g., model.tiff) using
download_url, then call download_url again with the same filepath but with the
incorrect hash_val to trigger the hash-validation branch and
assertRaises(RuntimeError); this avoids re-downloading to model_bad.tiff and
eliminates network flakiness—use the symbols download_url,
skip_if_downloading_fails, SAMPLE_TIFF, and SAMPLE_TIFF_HASH_TYPE to locate and
modify the test.
---
Outside diff comments:
In `@tests/testing_data/data_config.json`:
- Around line 88-92: The two remaining test-data entries images.nrrd_example and
configs.test_meta_file still point to the old GitHub release; update their "url"
fields to the corresponding HuggingFace dataset URLs used by the rest of the
config (replace the
github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/... links
with the HF dataset path), and if the file content or location changed on HF,
update the "hash_val" (keeping "hash_type": "sha256") to the new checksum; after
changing images.nrrd_example and configs.test_meta_file in data_config.json, run
the repo search used in the PR verification (e.g., rg -n 'MONAI-extra-test-data'
tests/testing_data/data_config.json) to ensure no remaining GitHub-release links
remain.
---
Nitpick comments:
In `@tests/test_utils.py`:
- Around line 195-215: The test_download_url test currently hard-codes
SAMPLE_TIFF, SAMPLE_TIFF_HASH and SAMPLE_TIFF_HASH_TYPE; replace those with
values loaded from the shared test config by calling
testing_data_config("images", "wsi_generic_tiff", "url"),
testing_data_config("images", "wsi_generic_tiff", "hash_val") and
testing_data_config("images", "wsi_generic_tiff", "hash_type") (or destructure a
single call if helper returns all fields) and pass those into download_url
instead of the constants; update imports in tests/test_utils.py to include
testing_data_config and keep the existing download_url usage and assertions
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d4c21ce8-3d39-4944-869e-df48b5adeeb6
📒 Files selected for processing (2)
tests/test_utils.pytests/testing_data/data_config.json
|
Please have a look at the DCO issue, we should be able to do a remedial commit to sort it. |
DCO Remediation Commit for Enoch Mok <enochmokny@gmail.com> I, Enoch Mok <enochmokny@gmail.com>, hereby add my Signed-off-by to this commit: d02da49 Signed-off-by: Enoch Mok <enochmokny@gmail.com>
for more information, see https://pre-commit.ci
|
Thanks @ericspod for your guidance throughout this, I am excited to have made my first contribution (albeit small)! I do wish to be doing this consistently - may I kindly ask if you guys have preference on which issues I should focus on, whether those in the 'Issues' tab or under MONAI v1.6 are more important to you guys? To future collaborations and contributions 🍻 |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/test_utils.py (1)
179-187:⚠️ Potential issue | 🔴 CriticalBroken
exceptchain —rt_eis undefined and non-matchingRuntimeErrors are now swallowed.The new
except ValueErrorblock was inserted between the(RuntimeError, OSError)body and itsraise rt_e, so:
- Line 187
raise rt_enow lives inside theexcept ValueErrorhandler, wherert_eis not in scope (Ruff F821). If aValueErrorwithout"hash check"is raised, this path explodes withNameError.- The original intent — re-raising a
RuntimeError/OSErrorwhose message doesn't matchDOWNLOAD_FAIL_MSGS— is gone. Such errors are now silently swallowed, masking real failures.- Missing
fromon the re-raise (B904).Proposed fix
except (RuntimeError, OSError) as rt_e: err_str = str(rt_e) if any(k in err_str for k in DOWNLOAD_FAIL_MSGS): raise unittest.SkipTest(f"Error while downloading: {rt_e}") from rt_e # incomplete download + raise except ValueError as v_e: if "hash check" in str(v_e): raise unittest.SkipTest(f"Hash value error while downloading: {v_e}") from v_e - - raise rt_e + raise🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_utils.py` around lines 179 - 187, The except chain currently mis-scopes rt_e and swallows RuntimeError/OSError; move the logic so that the (RuntimeError, OSError) except (the block that defines rt_e) either re-raises the original exception when its message doesn't match DOWNLOAD_FAIL_MSGS (use "raise rt_e from rt_e" or simply "raise" inside that except to preserve traceback), and keep the ValueError except separate (handle "hash check" by raising unittest.SkipTest with v_e, otherwise re-raise the ValueError with "raise" or "raise v_e from v_e"); ensure rt_e is not referenced inside the except ValueError block and that all re-raises include proper "from" chaining where applicable.monai/apps/utils.py (1)
215-276:⚠️ Potential issue | 🔴 CriticalDocstring and breaking change:
RuntimeErrorvsValueErrorLine 223 documents
RuntimeErrorfor downloaded-file hash failure, but line 271 raisesValueError. This breaks both the docstring and downstream code:
tests/apps/test_download_and_extract.pyline 45 catches(ContentTooShortError, HTTPError, RuntimeError)—ValueErrorescapes uncaught.- Line 48 checks
str(e).startswith("md5 check"), but the new message is"md5 hash check ...", so the assertion fails even if caught.- External code catching
RuntimeErrorarounddownload_urlis similarly broken.Fix: Change line 271 back to
RuntimeError(simplest, non-breaking) or update the docstring at line 223 and all call sites together.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@monai/apps/utils.py` around lines 215 - 276, The post-download hash-failure currently raises ValueError (the raise after logger.info and check_hash) which conflicts with the docstring/tests expecting a RuntimeError and a message starting with "{hash_type} check"; change that raise to raise RuntimeError and restore the message prefix to "{hash_type} check of downloaded file failed: URL={url}, filepath={filepath}, expected {hash_type}={hash_val}..." so callers/tests that catch RuntimeError and check str(e).startswith("{hash_type} check") will continue to work; keep the check_hash call and other surrounding logic (functions: check_hash, _download_with_progress, _basename) unchanged.
🧹 Nitpick comments (1)
tests/test_utils.py (1)
195-218: Docstring nit: document the actual class contract.Per the repo's Python guidelines (Google-style docstrings documenting raised exceptions), the class/method docstrings here say
Raises: RuntimeErrorbut post-fix the hash-mismatch path should beValueError. Update once the exception type is settled (see comment on Line 212).As per coding guidelines: "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_utils.py` around lines 195 - 218, Update the docstring and test to reflect the correct exception contract for download_url: change the "Raises: RuntimeError" entry in the test_download_url docstring to "Raises: ValueError" and update the assertion that expects an exception (self.assertRaises(RuntimeError)) to expect ValueError instead; refer to the TestDownloadUrl class and its test_download_url method and the download_url call to ensure the docstring and assertion match the download_url behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_utils.py`:
- Line 82: The test expectation for the hash-failure message doesn't match the
RuntimeError raised in download_url's existing-file branch
(RuntimeError(f"{hash_type} check of existing file failed: ...")) — update the
test in tests/test_utils.py to assert a common substring present in both
messages (e.g., " check" or include both variants like f"{hash_type} check" and
"hash check") or change the assertion to check for f"{hash_type} check" so it
matches the existing-file error and the downloaded-file error consistently;
reference the download_url function and the RuntimeError message formatting when
making the change.
- Around line 212-218: The test currently expects a RuntimeError but
download_url now raises ValueError on the fresh-download/hash-mismatch path:
update the assertion to use self.assertRaises(ValueError) for the download_url
call (identify the call by download_url(..., filepath=os.path.join(tempdir,
"model_bad.tiff"), hash_val="0"*64, hash_type=SAMPLE_TIFF_HASH_TYPE)).
Additionally, wrap that download_url invocation in a narrow network-exception
guard so CI flakes don’t fail the assertion: catch network-related exceptions
(e.g., requests.exceptions.RequestException / urllib.error.URLError / OSError)
and call or re-use skip_if_downloading_fails() (or otherwise skip the test) for
those cases, but do not catch or mask the ValueError path so the assertRaises
still verifies the hash-error behavior.
---
Outside diff comments:
In `@monai/apps/utils.py`:
- Around line 215-276: The post-download hash-failure currently raises
ValueError (the raise after logger.info and check_hash) which conflicts with the
docstring/tests expecting a RuntimeError and a message starting with
"{hash_type} check"; change that raise to raise RuntimeError and restore the
message prefix to "{hash_type} check of downloaded file failed: URL={url},
filepath={filepath}, expected {hash_type}={hash_val}..." so callers/tests that
catch RuntimeError and check str(e).startswith("{hash_type} check") will
continue to work; keep the check_hash call and other surrounding logic
(functions: check_hash, _download_with_progress, _basename) unchanged.
In `@tests/test_utils.py`:
- Around line 179-187: The except chain currently mis-scopes rt_e and swallows
RuntimeError/OSError; move the logic so that the (RuntimeError, OSError) except
(the block that defines rt_e) either re-raises the original exception when its
message doesn't match DOWNLOAD_FAIL_MSGS (use "raise rt_e from rt_e" or simply
"raise" inside that except to preserve traceback), and keep the ValueError
except separate (handle "hash check" by raising unittest.SkipTest with v_e,
otherwise re-raise the ValueError with "raise" or "raise v_e from v_e"); ensure
rt_e is not referenced inside the except ValueError block and that all re-raises
include proper "from" chaining where applicable.
---
Nitpick comments:
In `@tests/test_utils.py`:
- Around line 195-218: Update the docstring and test to reflect the correct
exception contract for download_url: change the "Raises: RuntimeError" entry in
the test_download_url docstring to "Raises: ValueError" and update the assertion
that expects an exception (self.assertRaises(RuntimeError)) to expect ValueError
instead; refer to the TestDownloadUrl class and its test_download_url method and
the download_url call to ensure the docstring and assertion match the
download_url behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e6a9e321-e574-4979-b315-d9e2ce06b551
📒 Files selected for processing (2)
monai/apps/utils.pytests/test_utils.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/test_utils.py (2)
198-218:⚠️ Potential issue | 🔴 CriticalExpect
ValueErrorfor the bad-hash download.
download_urlnow raisesValueErroron the fresh-download hash-mismatch path, so this assertion and docstring are stale.Proposed fix
- Raises: - RuntimeError: When the downloaded file's hash does not match. + Raises: + AssertionError: If the bad hash does not fail validation. @@ - with self.assertRaises(RuntimeError): + with self.assertRaises(ValueError): download_url( url=SAMPLE_TIFF, filepath=os.path.join(tempdir, "model_bad.tiff"),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_utils.py` around lines 198 - 218, Update the test and docstring to expect ValueError instead of RuntimeError for a hash-mismatch on a fresh download: change the test_download_url docstring "Raises" section and the assertion using self.assertRaises(RuntimeError) to self.assertRaises(ValueError), referencing the download_url function and the test_download_url test that triggers the bad-hash path.
78-87:⚠️ Potential issue | 🟡 MinorKeep a sentinel for existing-file hash failures too.
"hash check"matches downloaded-file failures, but not the existing-file message pattern likesha256 check of existing file failed, so cached hash failures may no longer be skipped.Proposed fix
- "hash check", # check hash value of downloaded file + "hash check", # check hash value of downloaded file + "check of existing file failed", # check hash value of existing file🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_utils.py` around lines 78 - 87, The DOWNLOAD_FAIL_MSGS tuple used to detect download/cache failures misses the cached-file hash-failure pattern, so add an additional sentinel string that matches messages like "sha256 check of existing file failed" (or a broader phrase such as "check of existing file failed" or "sha256 check") to DOWNLOAD_FAIL_MSGS so cached hash failures are recognized and skipped; update the DOWNLOAD_FAIL_MSGS definition (the DOWNLOAD_FAIL_MSGS symbol) to include that sentinel alongside the existing "hash check" entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_utils.py`:
- Around line 185-187: The except block in tests/test_utils.py catches
ValueError as v_e and only raises SkipTest when the message contains "hash
check", but swallows all other ValueError instances; update the except
ValueError handler (the branch handling v_e) so that if "hash check" is not in
str(v_e) you re-raise the original ValueError (raise v_e) instead of letting it
pass silently—keep the existing SkipTest behavior when the message matches.
---
Duplicate comments:
In `@tests/test_utils.py`:
- Around line 198-218: Update the test and docstring to expect ValueError
instead of RuntimeError for a hash-mismatch on a fresh download: change the
test_download_url docstring "Raises" section and the assertion using
self.assertRaises(RuntimeError) to self.assertRaises(ValueError), referencing
the download_url function and the test_download_url test that triggers the
bad-hash path.
- Around line 78-87: The DOWNLOAD_FAIL_MSGS tuple used to detect download/cache
failures misses the cached-file hash-failure pattern, so add an additional
sentinel string that matches messages like "sha256 check of existing file
failed" (or a broader phrase such as "check of existing file failed" or "sha256
check") to DOWNLOAD_FAIL_MSGS so cached hash failures are recognized and
skipped; update the DOWNLOAD_FAIL_MSGS definition (the DOWNLOAD_FAIL_MSGS
symbol) to include that sentinel alongside the existing "hash check" entry.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e8f5b280-b860-48e2-8ba3-4aef3a1f0fe9
📒 Files selected for processing (1)
tests/test_utils.py
|
Hi @e-mny further contributions are most welcome! Please have a look at issues, there are some marked as good first issues but check that no one has started work on them yet. The 1.6 project needs updating with items that are a priority so I wouldn't worry about what's there currently. |
Fixes #8510
Description
Types of changes
./runtests.sh -f -u --net --coverage../runtests.sh --quick --unittests --disttests.