Skip to content

Image analyzer#337

Draft
Parthiba-Hazra wants to merge 2 commits intomainfrom
ph/image-analysis
Draft

Image analyzer#337
Parthiba-Hazra wants to merge 2 commits intomainfrom
ph/image-analysis

Conversation

@Parthiba-Hazra
Copy link
Collaborator

@Parthiba-Hazra Parthiba-Hazra commented Mar 24, 2026

[Title]

📚 Description of Changes

Provide an overview of your changes and why they’re needed. Link to any related issues (e.g., "Fixes #123"). If your PR fixes a bug, resolves a feature request, or updates documentation, please explain how.

  • What Changed:
    (Describe the modifications, additions, or removals.)

  • Why This Change:
    (Explain the problem this PR addresses or the improvement it provides.)

  • Affected Components:
    (Which component does this change affect? - put x for all components)

  • Compose

  • K8s

  • Other (please specify)

❓ Motivation and Context

Why is this change required? What problem does it solve?

  • Context:
    (Provide background information or link to related discussions/issues.)

  • Relevant Tasks/Issues:
    (e.g., Fixes: #GitHub Issue)

🔍 Types of Changes

Indicate which type of changes your code introduces (check all that apply):

  • BUGFIX: Non-breaking fix for an issue.
  • NEW FEATURE: Non-breaking addition of functionality.
  • BREAKING CHANGE: Fix or feature that causes existing functionality to not work as expected.
  • ENHANCEMENT: Improvement to existing functionality.
  • CHORE: Changes that do not affect production (e.g., documentation, build tooling, CI).

🔬 QA / Verification Steps

Describe the steps a reviewer should take to verify your changes:

  1. (Step one: e.g., "Run make test to verify all tests pass.")
  2. (Step two: e.g., "Deploy to a Kind cluster with make create-kind && make deploy.")
  3. (Additional steps as needed.)

✅ Global Checklist

Please check all boxes that apply:

  • I have read and followed the CONTRIBUTING guidelines.
  • My code follows the code style of this project.
  • I have updated the documentation as needed.
  • I have added tests that cover my changes.
  • All new and existing tests have passed locally.
  • I have run this code in a local environment to verify functionality.
  • I have considered the security implications of this change.

Summary by Gitar

  • New CRD and API types:
    • Added ImageAnalysisResult CRD for storing dive analysis metrics, layer breakdown, and workload references
    • Defined spec/status fields including efficiency, wasted space, and file analysis data
  • Image analysis controller system:
    • Implemented periodic scan cycle with discovery, planning, job submission, and result collection phases
    • Added smart diff to skip recently analyzed images within scan window
    • Integrated scan state tracking and leader election support
  • Image discovery engine:
    • Scans cluster pods to extract container images, deduplicated by digest and grouped by node
    • Resolves workload owners (Deployment, StatefulSet, DaemonSet, Job, CronJob) from pod references
    • Filters by namespace and image patterns
  • Batch job orchestration:
    • Plans and submits Kubernetes batch Jobs with concurrency limits (cluster and per-node)
    • Builds Job specs with containerd socket mounting, workspace volumes, and configurable resources
    • Manages job lifecycle including TTL cleanup and old scan job deletion
  • Result collection and CRD persistence:
    • Parses NDJSON output from analyzer pods to extract dive results
    • Maps dive metrics to ImageAnalysisResult CRDs with threshold evaluation
    • Includes workload references, layer details, and wasted file listings (capped at 50)
  • Configuration and helpers:
    • Added environment-based config loader with validation for resource quantities and tolerations
    • Analyzer container (Dockerfile, bash script) for batch image analysis using dive and crane
    • Helm chart values and RBAC updates for image analysis permissions

This will update automatically on new commits.

activeDeadlineSeconds := int64(m.config.JobTimeoutMinutes * 60)

// backoffLimit from config.
backoffLimit := int32(m.config.MaxRetries)

Check failure

Code scanning / CodeQL

Incorrect conversion between integer types High

Incorrect conversion of an integer with architecture-dependent bit size from
strconv.Atoi
to a lower bit size type int32 without an upper bound check.

Copilot Autofix

AI 3 days ago

Generally, to fix this kind of issue you either (a) parse directly into the final integer type with an appropriate bit size, or (b) ensure that any narrowing conversion is preceded by an explicit upper/lower bound check that clamps or rejects out‑of‑range values. Here we can’t change the overall configuration loading semantics much, but we can safely bound MaxRetries at the point where it is converted to int32 for Kubernetes’ backoffLimit.

The best minimal fix is in internal/controller/image_job_manager.go inside buildJobObject. Instead of directly doing backoffLimit := int32(m.config.MaxRetries), we should sanitize m.config.MaxRetries into a local int32 with a small, reasonable upper bound. For example, treat any non‑positive value as “unset” and fall back to a default (e.g., 3), and cap excessively large values at a defined maximum (e.g., 10 or some other small constant) before converting. This avoids overflow, preserves existing behavior for normal values, and does not require changing config parsing or struct types.

Concretely:

  • In buildJobObject, replace the single line backoffLimit := int32(m.config.MaxRetries) with a small block:
    • Read m.config.MaxRetries into a local retries variable.
    • If retries <= 0, set it to a default (e.g., 1 or 3).
    • If retries > someMax, clamp it to someMax.
    • Then convert to int32.
  • No new imports are needed; we only use basic integer comparison and literals.
  • All other logic (timeouts, TTL, env vars) remains unchanged.

This guarantees that the int32 cast is safe regardless of the architecture or environment variable value and keeps behavior for normal, in‑range values effectively the same (other than defining sane bounds when misconfigured).

Suggested changeset 1
internal/controller/image_job_manager.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/internal/controller/image_job_manager.go b/internal/controller/image_job_manager.go
--- a/internal/controller/image_job_manager.go
+++ b/internal/controller/image_job_manager.go
@@ -318,8 +318,16 @@
 	// Compute activeDeadlineSeconds from config.
 	activeDeadlineSeconds := int64(m.config.JobTimeoutMinutes * 60)
 
-	// backoffLimit from config.
-	backoffLimit := int32(m.config.MaxRetries)
+	// backoffLimit from config. Clamp to a safe, reasonable range before converting to int32.
+	retries := m.config.MaxRetries
+	if retries <= 0 {
+		// Use a small, sane default if unset or negative.
+		retries = 1
+	} else if retries > 10 {
+		// Prevent overflow and unreasonable retry counts.
+		retries = 10
+	}
+	backoffLimit := int32(retries)
 
 	ttl := int32(ttlSecondsAfterFinished)
 
EOF
@@ -318,8 +318,16 @@
// Compute activeDeadlineSeconds from config.
activeDeadlineSeconds := int64(m.config.JobTimeoutMinutes * 60)

// backoffLimit from config.
backoffLimit := int32(m.config.MaxRetries)
// backoffLimit from config. Clamp to a safe, reasonable range before converting to int32.
retries := m.config.MaxRetries
if retries <= 0 {
// Use a small, sane default if unset or negative.
retries = 1
} else if retries > 10 {
// Prevent overflow and unreasonable retry counts.
retries = 10
}
backoffLimit := int32(retries)

ttl := int32(ttlSecondsAfterFinished)

Copilot is powered by AI and may make mistakes. Always verify output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant