
New condition for compressing data #148

Merged
KriFos1 merged 3 commits into Python-Ensemble-Toolbox:main from kjei:main
Mar 27, 2026


kjei (Contributor) commented Mar 27, 2026

Adds a new keyword, compress_data, to the COMPRESS block of the input file. It specifies which data type should be compressed. In the old version, the noise level estimated during compression was never used.

Copilot AI (Contributor) left a comment


Pull request overview

This PR updates the wavelet-compression flow to compress only the observation/prediction data type specified by a new compress_data keyword, and to use the compression-estimated noise when setting data variance.

Changes:

  • Add compress_data to the sparse representation configuration extracted from the COMPRESS block.
  • Switch compression triggers in observation/prediction organization from “size matches mask” to “datatype matches compress_data”.
  • Override DATAVAR using wavelet-estimated noise (est_noise**2) for the compressed datatype.
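The changes listed above can be sketched roughly as follows. This is a minimal illustration, not the toolbox's actual code: the helper names `should_compress` and `variance_from_noise` and the datatype names `"seismic"` and `"rate"` are hypothetical, and plain lists stand in for the NumPy arrays used in the real code. It also assumes that est_noise is a noise standard deviation, so that squaring it gives a variance.

```python
def should_compress(datatype, sparse_info):
    """Name-based trigger: True only for the datatype named by compress_data."""
    if sparse_info is None:
        return False
    return datatype == sparse_info.get("compress_data")


def variance_from_noise(est_noise):
    """Square each estimated noise value to obtain a variance (assumed std dev)."""
    return [sigma ** 2 for sigma in est_noise]


# Hypothetical configuration and datatype names, for illustration only.
sparse_info = {"compress_data": "seismic"}
flags = {name: should_compress(name, sparse_info) for name in ("seismic", "rate")}
variances = variance_from_noise([0.5, 2.0])
```

Selecting by name rather than by array size means that a second datatype which happens to have the same length as the mask is no longer compressed by accident.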

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/pipt/misc_tools/extract_tools.py | Adds compress_data to parsed sparse configuration. |
| src/pipt/loop/ensemble.py | Changes when obs data is compressed and when datavar is overridden by wavelet-estimated noise. |
| src/pipt/loop/assimilation.py | Updates forecast post-processing to select the compressed datatype by name and recompress predicted data. |

Comments suppressed due to low confidence (2)

src/pipt/loop/assimilation.py:486

  • In post_process_forecast, vintage is reset to 0 inside the for key in pred_data: loop and then incremented, but it isn’t used after that increment. This looks like leftover logic from the previous mask/size-based condition and can be removed to avoid confusion about how vintages are tracked.
                    # Reset vintage
                    vintage = 0

                    # Store according to sparse_info
                    if key == self.ensemble.sparse_info['compress_data'] and pred_data[key] is not None:
                        # If first entry in pred_data_tmp
                        if pred_data_tmp[i] is None:
                            pred_data_tmp[i] = {key: pred_data[key]}
                        else:
                            pred_data_tmp[i][key] = pred_data[key]

                        # Update vintage
                        vintage += 1
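
One way the selection could look once the unused counter is dropped is sketched below. The function name `select_compressed` and the sample datatype names are hypothetical, and the sketch replaces the inner loop over keys with a dictionary lookup, which should be equivalent only under the assumption that at most one key matches compress_data.

```python
def select_compressed(pred_data_list, compress_key):
    """Keep only the entry named by compress_key for each assimilation step.

    Steps where the key is missing or holds None stay as None, mirroring
    the `pred_data[key] is not None` check in the snippet above.
    """
    selected = [None] * len(pred_data_list)
    for i, pred_data in enumerate(pred_data_list):
        value = pred_data.get(compress_key)
        if value is not None:
            selected[i] = {compress_key: value}
    return selected


# Hypothetical predicted data for two assimilation steps.
pred = [
    {"seismic": [1.0, 2.0], "rate": [3.0]},
    {"seismic": None, "rate": [4.0]},
]
result = select_compressed(pred, "seismic")
```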

src/pipt/loop/ensemble.py:524

  • _org_data_var now overrides datavar based solely on datatype[j] == self.sparse_info['compress_data'] and then indexes self.sparse_data[vintage]. If the corresponding true-data compression didn’t run (e.g., non-.npz input / N/A / use_ensemble mode) or if vintage exceeds the number of stored sparse_data entries, this will raise IndexError and/or set a variance vector with a different length than obs_data. Add a guard that ensures the vintage exists and that the estimated-noise vector length matches the (compressed) observation length before overriding.
                if self.sparse_info is not None and self.datavar[i][datatype[j]] is not None and \
                            datatype[j]==self.sparse_info['compress_data']:
                    # compute var from sparse_data
                    est_noise = np.power(self.sparse_data[vintage].est_noise, 2)
                    self.datavar[i][datatype[j]] = est_noise  # override the given value
                    vintage = vintage + 1
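
The guard suggested in this comment might look something like the sketch below. `SparseStub` and `variance_override` are hypothetical names introduced for illustration, and plain lists replace the NumPy arrays of the original. The idea is to return None, leaving the user-supplied variance untouched, whenever the vintage is missing or the lengths disagree.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SparseStub:
    """Hypothetical stand-in for one stored compression result."""
    est_noise: List[float]


def variance_override(
    sparse_data: Optional[Sequence[SparseStub]], vintage: int, n_obs: int
) -> Optional[List[float]]:
    """Return est_noise squared for this vintage, or None if unsafe to override."""
    # No compression was run, or this vintage was never stored.
    if sparse_data is None or vintage >= len(sparse_data):
        return None
    est_noise = sparse_data[vintage].est_noise
    # The variance vector must match the compressed observation length.
    if len(est_noise) != n_obs:
        return None
    return [sigma ** 2 for sigma in est_noise]


stored = [SparseStub(est_noise=[0.5, 2.0])]
ok = variance_override(stored, vintage=0, n_obs=2)
missing = variance_override(stored, vintage=1, n_obs=2)
mismatch = variance_override(stored, vintage=0, n_obs=3)
```

A caller would then advance the vintage counter and overwrite datavar only when the returned value is not None.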


KriFos1 and others added 2 commits March 27, 2026 13:11
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@KriFos1 KriFos1 merged commit 3800e9e into Python-Ensemble-Toolbox:main Mar 27, 2026
3 checks passed

3 participants