Skip to content

Fix KeyError when accessing metadata for PSP RFS CDF files#187

Open
MohamedAli1937 wants to merge 4 commits intosunpy:mainfrom
MohamedAli1937:fix-rfs-cdf-meta-key
Open

Fix KeyError when accessing metadata for PSP RFS CDF files#187
MohamedAli1937 wants to merge 4 commits intosunpy:mainfrom
MohamedAli1937:fix-rfs-cdf-meta-key

Conversation

@MohamedAli1937
Copy link
Copy Markdown

PR Description

Fixes #184

Summary:
This PR resolves a key mismatch between SpectrogramFactory and the RFSSpectrogram class that causes a KeyError when attempting to access metadata properties on spectrograms loaded from real Parker Solar Probe (PSP RFS) CDF files.

Motivation and Context:
Currently, SpectrogramFactory reads global attributes from CDF files using cdflib and passes them into the final metadata dictionary under the key "cdf_globals".
However, properties in RFSSpectrogram (specifically level and version in psp_rfs.py) attempt to read that data using self.meta["cdf_meta"].

Because of this mismatch, downloading a real PSP RFS file via Fido and calling spec.level throws an instant runtime KeyError: 'cdf_meta'. (The existing unit tests pass only because they mock the parse_path return value and manually inject mock metadata using the "cdf_meta" key.)

Changes:

  • Modified the output dictionary key from "cdf_globals" to "cdf_meta" in spectrogram_factory.py (for both PSP and SOLO RPW parsing blocks) to align the factory output with the keys expected by the source classes and the existing test suite.

Architectural Justification: Why modify the Factory instead of psp_rfs.py?
It might intuitively seem safer to change psp_rfs.py to look for "cdf_globals". However, surgically modifying spectrogram_factory.py is the strictly correct choice for the following reasons:

  1. Test Suite Compatibility: The existing test suite (e.g., test_psp_rfs.py, and test_solo_rpw.py) explicitly mocks metadata using the "cdf_meta" key.
  2. Zero-Regression Risk: A codebase-wide search confirms that the string key "cdf_globals" was used exclusively inside the factory output.
  3. Separation of Concerns: cdf_globals represents the raw output from the cdflib.globalattsget() function. Renaming the final exported dictionary key to "cdf_meta" abstracts the data layer away from low-level cdflib terminology, which is a cleaner architectural design.

Note on Implementation: The local Python variables inside the factory function remain named cdf_globals. This is intentional: they correctly describe the local return of cdflib, and leaving them as-is minimizes unnecessary line changes (diff noise) in the PR. Only the exported dictionary key string was updated (i.e., "cdf_meta": cdf_globals).

AI Assistance Disclosure

AI tools were used for:

  • Code generation (e.g., when writing an implementation or fixing a bug)
  • Test/benchmark generation
  • Documentation (including examples)
  • Research and understanding
  • No AI tools were used

Regardless of AI use, the human contributor remains fully responsible for correctness, design choices, licensing compatibility, and long-term maintainability.

The SpectrogramFactory stored CDF global attributes under the key 'cdf_globals' in the metadata dict, but RFSSpectrogram.level and RFSSpectrogram.version read from 'cdf_meta'. This mismatch causes a KeyError when accessing spec.level or spec.version on spectrograms created from real CDF files. The existing tests use 'cdf_meta' in their mock metadata, so they pass despite the bug. Renamed all 4 occurrences in the factory to 'cdf_meta' for consistency.
Copy link
Copy Markdown
Member

@samaloney samaloney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but could you add a failing test using real data which this PR fixes?

@MohamedAli1937
Copy link
Copy Markdown
Author

MohamedAli1937 commented Mar 27, 2026

I added a real data test (test_psp_rfs_real_data) using @pytest.mark.remote_data and Fido.fetch.

While doing this, I found a hidden bug: mock data used "Data_type" as a string, but real CDF data (via cdflib) returns it as a list (e.g., ["L2>Level 2 Data"]). This caused: AttributeError: 'list' object has no attribute 'split' in psp_rfs.py.

Fixes in this PR:

  • Added the real data test to validate spec.level and spec.version.
  • Updated RFSSpectrogram to handle list values safely.

All tests pass locally, including the new real data test.

To run the test:

pytest --remote-data=any radiospectra/spectrogram/sources/tests/test_psp_rfs.py

Please have a look @samaloney, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RFSSpectrogram level and version properties raise KeyError on real CDF files

2 participants