Add min_value parameter for column anomaly detection tests #2177

@caterinaandreu

Is your feature request related to a problem? Please describe.

When using column anomaly tests (e.g., column_anomalies, all_columns_anomalies) to monitor metrics like null_percent, the z-score-based anomaly detection can trigger false positives on small, practically insignificant values. For example, if a column historically has 0% nulls and suddenly has 0.5% nulls, the statistical deviation is large enough for the value to be flagged as anomalous, even though 0.5% nulls may be perfectly acceptable from a business perspective.

The existing ignore_small_changes parameter helps by gating on percentage deviation from the training mean, but it doesn't provide an absolute floor. If the mean is near zero, even tiny absolute changes produce large percentage deviations, making ignore_small_changes ineffective in this scenario.
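
To make the near-zero problem concrete, here is a small Python sketch with made-up numbers. The history values are hypothetical, and the z-score and percentage-deviation formulas are simplified stand-ins for illustration, not Elementary's actual implementation.

```python
from statistics import mean, pstdev

# Hypothetical training history of null_percent values (almost always zero).
history = [0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0]
current = 0.5  # latest null_percent, small in absolute terms

training_mean = mean(history)
training_std = pstdev(history)

# Simplified z-score: distance from the training mean in standard deviations.
z_score = (current - training_mean) / training_std

# Simplified relative check: percentage deviation from the training mean.
percent_deviation = abs(current - training_mean) / training_mean * 100

print(f"z-score: {z_score:.1f}")
print(f"deviation from mean: {percent_deviation:.0f}%")
```

Both numbers come out very large even though the metric only moved by half a percentage point, so neither a z-score threshold nor a percentage-based gate can express "this value is too small to matter".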

Describe the solution you'd like
A new optional parameter min_value (numeric, supports decimals) for column_anomalies and all_columns_anomalies tests. When set, an anomaly will only be flagged as a failure if metric_value >= min_value. If the metric value is below this threshold, the test passes regardless of the z-score.
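
A minimal sketch of the gating logic I have in mind, in plain Python. The function name is_failure and the is_anomalous flag are hypothetical and only illustrate the proposed behavior; they are not taken from the Elementary codebase.

```python
from typing import Optional


def is_failure(
    metric_value: float,
    is_anomalous: bool,
    min_value: Optional[float] = None,
) -> bool:
    """Decide whether an anomaly should fail the test (illustrative only)."""
    # Proposed behavior: below the absolute floor, the test passes
    # regardless of what the z-score check concluded.
    if min_value is not None and metric_value < min_value:
        return False
    # Without min_value, or at/above it, keep the existing anomaly result.
    return is_anomalous


# 0.5% nulls with a 5% floor: anomalous by z-score, but not a failure.
print(is_failure(0.5, is_anomalous=True, min_value=5.0))
# 7% nulls with a 5% floor: still a failure.
print(is_failure(7.0, is_anomalous=True, min_value=5.0))
```

Leaving min_value unset keeps today's behavior, so the parameter would be fully opt-in.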

Describe alternatives you've considered

  • ignore_small_changes: Works as a relative threshold (percentage deviation from the mean), but breaks down when the training mean is near zero since any small absolute change becomes a large relative change.
  • anomaly_sensitivity: Adjusting the z-score threshold can reduce false positives generally, but it's not targeted — it relaxes detection across the board rather than setting a meaningful business-level floor.
  • fail_on_zero: Only addresses the specific case of zero values, not a configurable minimum.

None of these provide a simple, absolute lower bound that maps directly to a business-meaningful threshold (e.g., "I don't care about nulls unless they exceed 5%").

Would you be willing to contribute this feature?
Yes
