This project is extensively documented. When you don't know how to do something, start with the documentation, then verify against source code. You have access to both - use them together. The better you maintain the documentation as you work, the more valuable it becomes for future development.
The CloudZero Agent is a complex, multi-component Kubernetes integration system with multiple Go applications, extensive testing infrastructure, and comprehensive documentation. This guide helps you navigate and maintain that documentation system.
Documentation is organized in layers of increasing specificity as you descend the directory tree:
root/ # Project-wide documentation
├── README.md # Overview, components, message formats
├── DEVELOPMENT.md # Build system, testing, deployment
├── CONTRIBUTING.md # Contribution standards
├── CLAUDE.md # This file - AI navigation guide
│
├── app/ # Go application code
│ ├── README.md # Application architecture (hexagonal pattern)
│ ├── CLAUDE.md # AI-specific development guidance
│ ├── domain/ # Business logic layer
│ │ └── README.md # Domain patterns and usage
│ ├── functions/ # CLI applications
│ │ ├── collector/
│ │ │ └── README.md # Collector-specific docs
│ │ └── webhook/
│ │ └── README.md # Webhook-specific docs
│ └── types/ # Core interfaces
│ └── README.md # Interface contracts
│
├── helm/ # Kubernetes deployment
│ ├── README.md # Chart structure
│ ├── templates/ # Extensively commented YAML
│ ├── docs/ # Operational guides
│ │ ├── troubleshooting-guide.md
│ │ └── cert-trouble-shooting.md
│ └── tests/ # Helm unit tests (standard location)
│
├── tests/ # Most tests live here
│ ├── README.md # Testing philosophy and methodology
│ ├── helm/ # Helm integration tests
│ │ ├── README.md # Helm testing strategies
│ │ └── templates/ # Local test overrides (gitignored)
│ ├── kuttl/ # Kubernetes integration tests
│ │ └── README.md # KUTTL framework usage
│ └── integration/ # API integration tests
│ └── README.md # Integration test patterns
│
├── clusters/ # Multi-cluster deployment configs
│ └── README.md # Cluster configuration system
│
└── docs/ # Additional documentation
└── testing/ # Testing guides
Source Code Documentation:
- Go files: Fully documented with go doc comments
- Helm templates: Extensive YAML comments explaining logic
- Makefile: Commented targets and functions
Markdown Documentation:
- Root level: Project-wide concerns (README.md, DEVELOPMENT.md, CONTRIBUTING.md)
- Component level: Architecture and usage (app/README.md, helm/README.md)
- Subdirectory level: Specific implementations (app/domain/README.md, tests/kuttl/README.md)
MANDATORY: Every development task must follow a test-driven approach:
Core principles:
- Write tests first - Define expected behavior before implementation
- Add tests incrementally - Create tests for each new feature or bug fix
- Test at all levels - Unit, integration, Helm, and end-to-end
- Use tests for debugging - When issues arise, add tests to reproduce and verify fixes
Test coverage requirements:
- Go code: Unit tests with gomock for isolation, table-driven tests for scenarios
- Helm charts: Unit tests (helm-unittest), schema validation, template tests, KUTTL e2e tests
- Integration: API tests, smoke tests, complete workflow tests
- Quality over quantity: Target >90% coverage with meaningful tests
DO NOT run tools directly. The Makefile ensures consistent tool versions, proper flags, and dependency handling:
# ✅ CORRECT - Use Makefile targets
make -j format lint test build
make test GO_TEST_TARGET=./app/domain
make generate
# ❌ WRONG - Direct tool invocation
go test ./...
go generate
golangci-lint run
Why Make is required:
- Ensures consistent tool versions and flags
- Handles dependencies and generated files correctly
- Provides parallelism with the -j flag
- Integrates with the project's build system properly
Parallelization: Use -j flag for parallel execution - the Makefile is designed for this:
make -j test-all # Run all tests in parallel
make -j format lint analyze # Multiple targets in parallel
The clusters/ directory provides a powerful multi-cluster deployment system:
# Deploy to specific cluster
CLUSTER_NAME=my-cluster make helm-install helm-wait
# Use KIND for local development (default)
make helm-install helm-wait
# Uninstall from cluster
CLUSTER_NAME=my-cluster make helm-uninstall
Add your own clusters: Create clusters/my-cluster.yaml and clusters/my-cluster-overrides.yaml (gitignored by default)
The tests/helm/templates/ directory gitignores files matching local-*.yaml:
# Create local test configuration
cat > tests/helm/templates/local-test-overrides.yaml <<EOF
components:
collector:
replicas: 3
EOF
# Generate manifest with overrides
make tests/helm/templates/local-test.yaml
# Diff against the default manifest to see what the overrides change
diff -u tests/helm/templates/manifest.yaml tests/helm/templates/local-test-manifest.yaml
This is an extremely powerful tool for understanding how configuration affects output.
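To make the `local-*.yaml` naming rule concrete, the sketch below checks which file names would fall under that pattern. It is an illustration only: `IsLocalOverride` is a hypothetical helper, and Go's `path.Match` only approximates gitignore globbing, although the two are expected to agree for a pattern this simple.

```go
package main

import (
	"fmt"
	"path"
)

// IsLocalOverride reports whether a file's base name matches the
// local-*.yaml pattern that this guide says is gitignored in
// tests/helm/templates/. Hypothetical helper, not part of the repository.
func IsLocalOverride(filePath string) bool {
	ok, err := path.Match("local-*.yaml", path.Base(filePath))
	return err == nil && ok
}

func main() {
	files := []string{
		"tests/helm/templates/local-test-overrides.yaml",
		"tests/helm/templates/local-test-manifest.yaml",
		"tests/helm/templates/manifest.yaml",
	}
	for _, f := range files {
		fmt.Printf("%-50s ignored=%v\n", f, IsLocalOverride(f))
	}
}
```

The first two names match the pattern and the default `manifest.yaml` does not, so experiments kept under a `local-` prefix stay out of commits while the default manifest remains tracked.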
When you don't know something:
- Check the relevant README - Start at the appropriate level (component, subdirectory)
- Review CLAUDE.md files - AI-specific guidance in relevant directories
- Review README.md and other markdown files - There is extensive documentation in markdown files in the tree
- Read the source code - Extensively commented with go doc comments and inline docs
- Check DEVELOPMENT.md - Build system, testing, and workflow patterns
- Examine Makefile targets - Many capabilities are built into the Makefile
Component READMEs often document:
- How to run tests for that specific component
- Useful Makefile variables (e.g., GO_TEST_TARGET, GO_TEST_FLAGS)
- Component-specific testing tricks and configurations
- Common debugging approaches
Example Research Path for Adding a New Feature:
1. app/README.md → Understand hexagonal architecture
2. app/types/README.md → Review interface contracts
3. Similar component code → See established patterns
4. DEVELOPMENT.md → Understand testing requirements
5. Implement following patterns
CRITICAL: When making changes, documentation must be updated synchronously to avoid publishing outdated information.
For every code change, check documentation at all relevant levels:
flowchart TD
Change[Make Code Change] --> Check{Check Documentation}
Check -->|Source Level| Source[Update godoc comments<br/>Update YAML template comments<br/>Update inline documentation]
Check -->|Component Level| Component[Update README.md<br/>Update CLAUDE.md<br/>Update examples]
Check -->|System Level| System[Update root README.md<br/>Update DEVELOPMENT.md<br/>Update guides in docs/]
Source --> Validate[Verify Consistency]
Component --> Validate
System --> Validate
Validate --> Test[Test Examples]
Test --> Commit[Commit All Changes Together]
Before committing changes:
- Updated godoc comments if function behavior changed
- Updated YAML template comments if Helm logic changed
- Updated component README if architecture/usage changed
- Updated examples if interfaces changed
- Verified examples execute successfully
- Checked root README for outdated information
- Updated DEVELOPMENT.md if build/test workflows changed
When making changes, bubble up through the directory tree checking documentation at each level:
- Start at implementation - Update godoc/inline comments
- Check subdirectory - Update component README/CLAUDE.md
- Check parent directory - Update broader documentation
- Check root - Update project-wide documentation if needed
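The bubble-up walk above can be sketched as a small Go program that, given a changed file, lists the documentation files to review at each level. This is a sketch under stated assumptions: `DocCandidates` is a hypothetical helper, `app/domain/collector.go` is a made-up file name, and the function only builds candidate paths without checking that the files exist.

```go
package main

import (
	"fmt"
	"path"
)

// docNames lists the per-directory documentation files this guide mentions.
var docNames = []string{"README.md", "CLAUDE.md"}

// DocCandidates returns the documentation files to review for a changed
// file, starting in the file's own directory and ending at the repository
// root. It only builds paths; it does not check that the files exist.
func DocCandidates(changed string) []string {
	var out []string
	dir := path.Dir(path.Clean(changed))
	for {
		for _, name := range docNames {
			out = append(out, path.Join(dir, name))
		}
		// "." is the root for relative paths, "/" for absolute ones.
		if dir == "." || dir == "/" {
			return out
		}
		dir = path.Dir(dir)
	}
}

func main() {
	// Hypothetical changed file, used only for illustration.
	for _, p := range DocCandidates("app/domain/collector.go") {
		fmt.Println(p)
	}
}
```

For the example path this prints six candidates, from `app/domain/README.md` up to the root `CLAUDE.md`. Godoc and inline comments in the changed file itself still need a manual review, since they live in the source rather than in separate files.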
The repository produces a Helm Chart (helm/) that is mirrored to github.com/Cloudzero/cloudzero-charts - this is the primary user-facing artifact. Everything else serves development, testing, and build infrastructure:
root/
├── app/ # All Go application code
│ ├── domain/ # Business logic (hexagonal architecture core)
│ ├── functions/ # CLI applications (collector, shipper, webhook, validator)
│ ├── types/ # Core interfaces and contracts
│ ├── handlers/ # HTTP and request handlers
│ ├── storage/ # Data persistence layer
│ └── utils/ # Shared utilities
│
├── helm/ # Kubernetes deployment (USER-FACING)
│ ├── templates/ # Extensively commented K8s manifests
│ ├── tests/ # Helm unit tests (standard Helm location)
│ └── docs/ # Operational guides and troubleshooting
│
├── tests/ # Comprehensive testing infrastructure
│ ├── integration/ # API integration tests
│ ├── helm/ # Helm integration tests
│ │ └── templates/ # Test configurations (gitignored local-*.yaml)
│ ├── kuttl/ # Kubernetes end-to-end tests
│ ├── smoke/ # Production validation
│ └── load/ # Performance testing
│
├── clusters/ # Multi-cluster deployment configurations
├── docker/ # Container build definitions
├── docs/ # Additional documentation
├── scripts/ # Build and utility scripts
└── .tools/ # Development tool dependencies
Note that unit tests are co-located with the Go source as *_test.go files.
This project has extensive testing infrastructure covering multiple layers:
- Unit tests: Co-located with Go source files
- Integration tests: tests/integration/ - API integration
- Helm tests: helm/tests/ (unit) and tests/helm/ (integration)
- Kubernetes tests: tests/kuttl/ - End-to-end cluster testing
- Smoke tests: tests/smoke/ - Production validation
- Load tests: tests/load/ - Performance testing
To run all tests: make -j test-all. WARNING: this is very slow (more than 10 minutes) because it runs the full set of test suites.
See tests/README.md for comprehensive testing methodology.
For Go development:
- app/README.md - Architecture overview
- app/types/README.md - Interface contracts
- Component-specific README - Detailed patterns
- Source code - See existing implementations
- DEVELOPMENT.md - Testing requirements
For Helm and Kubernetes work:
- helm/README.md - Chart structure
- helm/docs/ - Operational guides
- tests/helm/README.md - Testing strategies
- Helm template comments - Inline documentation
- DEVELOPMENT.md - Build workflows
For troubleshooting:
- helm/docs/troubleshooting-guide.md - Common issues
- helm/docs/cert-trouble-shooting.md - Certificate issues
- Component README files - Behavior documentation
- DEVELOPMENT.md - Build and deployment
For testing:
- Check component README first - Many directories document how to test that component
- tests/README.md - Testing philosophy
- Test type-specific README - Detailed guidance (e.g., tests/kuttl/README.md)
- DEVELOPMENT.md - Test execution commands
- Component README - Component testing patterns and useful tricks
- This project is extremely well-documented - Research before guessing
- Always use Makefile targets - Never run tools directly
- Use -j for parallel execution - Everything is designed for this
- Leverage cluster configurations - Easy multi-cluster deployment
- Use local override files - Powerful testing tool in tests/helm/templates/
- Maintain documentation consistency - Update all levels when making changes
- Follow established patterns - Consistency is critical in complex systems
- Trust the inline documentation - Godoc and YAML comments are comprehensive
# Essential development cycle
make -j format lint test build
# Comprehensive testing (WARNING: slow, > 10 minutes)
make -j test-all
# Targeted testing (very useful - check component READMEs for more tricks)
GO_TEST_TARGET=./app/domain make test # Test specific package
GO_TEST_TARGET=./app/functions/collector make test # Test collector only
GO_TEST_FLAGS="-v -run TestSpecificFunction" make test # Specific test
# Specific test categories
make test # Go unit tests
make test-integration # API integration tests
make test-smoke # Smoke tests
make helm-test # All Helm validation tests
make helm-test-kuttl # Kubernetes end-to-end tests
# Helm operations
make helm-lint # Lint Helm chart
make helm-install helm-wait # Deploy and wait for ready
CLUSTER_NAME=my-cluster make helm-install # Deploy to custom cluster
make helm-uninstall # Remove from cluster
# Full workflow (creates cluster, tests, cleans up)
make kind-test # Complete KIND testing workflow
# Cleanup
make clean # Remove build artifacts
make kind-down # Delete KIND cluster
# Tools and help
make install-tools # Install development dependencies
make help # List all available targets
You have access to extensive documentation and source code. Start with the documentation to understand the system, then verify your understanding against the source. The documentation exists at multiple levels - use the hierarchical structure to find what you need, and keep it up-to-date as you make changes. The more accurately you maintain it, the more valuable it becomes.