Merged
52 changes: 24 additions & 28 deletions AGENTS.md
@@ -1,28 +1,24 @@
# AGENTS.md

## Working rules

- Inspect existing files before editing.
- Make minimal coherent changes.
- Prioritize an end-to-end runnable MVP over polish.
- Do not present the repo as production-ready.
- Run tests after code changes.

## Project focus

- Timestamped event streams
- Sliding-window aggregation
- Telemetry features
- Simple rule-based alerts
- Reproducible outputs from sample data

## Review guidelines

- Treat README and documentation mismatches against actual CLI/runtime behavior as high-priority findings.
- Check all input-format claims against the real loader implementation.
- Treat missing edge-case tests as important review findings when behavior depends on time parsing, window boundaries, or alert thresholds.
- Prefer correcting documentation to match real behavior unless the code path is accidental or deprecated.
- Flag alerting logic that is obviously too noisy for the bundled sample dataset.
- Prefer small, scoped fixes over broad refactors during PR review.
- Do not request production-grade features in a portfolio prototype unless the PR explicitly aims to add them.
- When reviewing plots, outputs, and examples, verify that referenced files and commands actually exist.
# AGENTS.md

## Working rules

- Inspect existing files before editing.
- Make minimal coherent changes.
- Prefer small, reviewable pull requests.
- Prioritize correctness, reproducibility, and README accuracy over polish.
- Do not present the repo as production-ready.

## Build and test

- Install: `python -m pip install -e .`
- Test: `pytest`
- Demo run: `python -m telemetry_window_demo.cli run --config configs/default.yaml`

## Review guidelines

- Treat README or docs mismatches against actual CLI/runtime behavior as important findings.
- Check input-format claims against the real loader implementation.
- Treat missing edge-case tests as important findings when behavior depends on time parsing, window boundaries, or alert thresholds.
- Flag alerting logic that is obviously too noisy for the bundled sample dataset.
- Prefer small, scoped fixes over broad refactors during review.
- Verify that referenced commands, files, and output artifacts actually exist.
214 changes: 112 additions & 102 deletions README.md
@@ -1,107 +1,117 @@
# telemetry-lab

[![CI](https://github.com/stacknil/telemetry-lab/actions/workflows/ci.yml/badge.svg)](https://github.com/stacknil/telemetry-lab/actions/workflows/ci.yml)

# telemetry-lab
[![CI](https://github.com/stacknil/telemetry-lab/actions/workflows/ci.yml/badge.svg)](https://github.com/stacknil/telemetry-lab/actions/workflows/ci.yml)
Small portfolio prototypes for telemetry analytics, monitoring, and detection-oriented signal processing.

## What This Repo Is

`telemetry-window-demo` is a local Python CLI that turns timestamped event streams into:

- sliding-window feature tables
- cooldown-reduced rule-based alerts
- PNG timeline plots
- machine-readable run summaries
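
The sliding-window feature step can be sketched as follows, assuming events are dicts with a parsed `timestamp` and a `status` field; the function name and window defaults here are illustrative, not the demo's actual API:

```python
from datetime import timedelta

def window_features(events, window_seconds=60, step_seconds=30):
    """Bucket timestamped events into overlapping windows and compute
    simple per-window features. `events` must be non-empty dicts with a
    datetime `timestamp` and a `status` string (illustrative schema)."""
    events = sorted(events, key=lambda e: e["timestamp"])
    window = timedelta(seconds=window_seconds)
    step = timedelta(seconds=step_seconds)
    rows = []
    t = events[0]["timestamp"]
    end = events[-1]["timestamp"]
    while t <= end:
        in_window = [e for e in events if t <= e["timestamp"] < t + window]
        errors = sum(1 for e in in_window if e["status"] == "error")
        rows.append({
            "window_start": t.isoformat(),
            "event_count": len(in_window),
            "error_rate": errors / len(in_window) if in_window else 0.0,
        })
        t += step
    return rows
```

The overlap comes from the step being smaller than the window; with equal step and window, the windows tile without overlap.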

## Quick Run

```bash
python -m pip install -e .
python -m telemetry_window_demo.cli run --config configs/default.yaml
```

That command reads `data/raw/sample_events.jsonl` and regenerates:

- `data/processed/features.csv`
- `data/processed/alerts.csv`
- `data/processed/summary.json`
- `data/processed/event_count_timeline.png`
- `data/processed/error_rate_timeline.png`
- `data/processed/alerts_timeline.png`

With the bundled default sample, the current repo state produces:

- `41` normalized events
- `24` windows
- `12` alerts after a `60` second cooldown

Why it is worth a quick look:

- it shows a full telemetry path from raw events to operator-facing outputs
- the sample inputs and outputs are reproducible in-repo
- a second bundled scenario gives a slightly richer walkthrough without changing the basic CLI flow

![Default alert timeline](data/processed/alerts_timeline.png)

## Demo Variants

Default sample:
## Demos

- config: [`configs/default.yaml`](configs/default.yaml)
- input: `data/raw/sample_events.jsonl`
- outputs: `data/processed/`
- current summary: `41` events, `24` windows, `12` alerts, `summary.json` included
- [telemetry-window-demo](#telemetry-window-demo)
- [ai-assisted-detection-demo](demos/ai-assisted-detection-demo/README.md)

Richer sample:
| Demo | Input | Deterministic core | LLM role | Main artifacts | Guardrails / non-goals |
| --- | --- | --- | --- | --- | --- |
| [telemetry-window-demo](#telemetry-window-demo) | JSONL / CSV events | Windows<br>Features<br>Alert thresholds | None | `features.csv`<br>`alerts.csv`<br>`summary.json`<br>3 PNG plots | MVP only<br>No realtime<br>No case management |
| [ai-assisted-detection-demo](demos/ai-assisted-detection-demo/README.md) | JSONL auth / web / process | Normalize<br>Rules<br>Grouping<br>ATT&CK mapping | JSON-only case drafting | `rule_hits.json`<br>`case_bundles.json`<br>`case_summaries.json`<br>`case_report.md`<br>`audit_traces.jsonl` | Human verification required<br>No autonomous response<br>No final verdict |

- config: [`configs/richer_sample.yaml`](configs/richer_sample.yaml)
- input: `data/raw/richer_sample_events.jsonl`
- outputs: `data/processed/richer_sample/`
- current summary: `28` events, `24` windows, `8` alerts, `summary.json` included

## Input Support

Runtime input support:

- `.jsonl`
- `.csv`

Required fields for both formats on every row or record:

- `timestamp`
- `event_type`
- `source`
- `target`
- `status`
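
For illustration, a record carrying the five required fields might look like this in each format; all values below are made up:

```python
import csv
import io
import json

# One JSONL record with the five required fields (illustrative values).
jsonl_line = json.dumps({
    "timestamp": "2024-01-01T00:00:00Z",
    "event_type": "login",
    "source": "host-a",
    "target": "auth-svc",
    "status": "error",
})

# The equivalent CSV row under the same required header.
csv_text = io.StringIO(
    "timestamp,event_type,source,target,status\n"
    "2024-01-01T00:00:00Z,login,host-a,auth-svc,error\n"
)
rows = list(csv.DictReader(csv_text))
```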

Cooldown behavior:

- repeated alerts are keyed by `(rule_name, scope)`
- scope prefers the first available entity-like field in this order: `entity`, `source`, `target`, `host`
- when no entity-like field is present, cooldown falls back to per-`rule_name` behavior
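
The cooldown rules above can be sketched like this; `alert_scope` and `apply_cooldown` are hypothetical helper names, not the demo's actual functions:

```python
ENTITY_FIELDS = ("entity", "source", "target", "host")

def alert_scope(alert):
    """Pick the first entity-like field present; None falls back to
    per-rule cooldown, matching the documented behavior."""
    for field in ENTITY_FIELDS:
        if alert.get(field):
            return alert[field]
    return None

def apply_cooldown(alerts, cooldown_seconds=60):
    """Suppress repeat alerts for the same (rule_name, scope) key within
    the cooldown interval. `alerts` must be time-sorted dicts with
    `rule_name` and an epoch-seconds `ts` (illustrative schema)."""
    last_fired = {}
    kept = []
    for alert in alerts:
        key = (alert["rule_name"], alert_scope(alert))
        prev = last_fired.get(key)
        if prev is None or alert["ts"] - prev >= cooldown_seconds:
            kept.append(alert)
            last_fired[key] = alert["ts"]
    return kept
```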

## Repo Guide

- [`docs/sample-output.md`](docs/sample-output.md) summarizes the committed sample artifacts
- [`docs/roadmap.md`](docs/roadmap.md) sketches the next demo directions
- [`data/processed/summary.json`](data/processed/summary.json) captures the default run in machine-readable form
- [`data/processed/richer_sample/summary.json`](data/processed/richer_sample/summary.json) captures the richer scenario pack
- [`tests/`](tests/) keeps regression coverage close to the CLI behavior and windowing logic

## Next Demo Directions

- strengthen JSONL and CSV validation so ingestion failures are clearer
- keep reducing repeated alert noise while preserving simple rule-based behavior
- keep sample-output docs and public repo presentation aligned with the checked-in demo state

## Scope

This repository is a portfolio prototype, not a production monitoring system.

## Limitations

- No real-time ingestion
- No streaming state management
- No alert routing or case management
- No dashboard or service deployment
- Sample-data driven only
124 changes: 124 additions & 0 deletions demos/ai-assisted-detection-demo/README.md
@@ -0,0 +1,124 @@
# AI-Assisted Detection Demo

This demo is part of `telemetry-lab` and is intentionally framed as a portfolio-grade security engineering prototype.

It demonstrates constrained AI-assisted case drafting for SOC-style workflows, not autonomous detection or response.

It combines deterministic detections with a tightly constrained LLM stage:

- the rules decide which activity is interesting
- the grouping logic decides which hits belong in the same case
- the LLM is limited to structured summaries, likely causes, uncertainty notes, and suggested next steps

The LLM does **not** make final incident decisions, modify rules, call tools, or execute response actions. Human verification is always required.

## Purpose

The goal is to show a credible bridge between deterministic telemetry analytics and safe analyst assistance.

This is not an autonomous SOC. It is a constrained drafting pipeline that keeps rule logic, ATT&CK mapping, case grouping, and evidence handling deterministic.

## Pipeline

1. ingest sample auth, web, and process events from JSONL
2. normalize them into a shared internal schema
3. apply deterministic detection rules
4. group rule hits into cases by shared entities and time proximity
5. attach ATT&CK mappings from rule metadata
6. build a case bundle with raw evidence, rule hits, severity, and evidence highlights
7. pass the case bundle to a constrained local demo LLM adapter with strict instruction and data separation
8. require JSON-only output against a local schema
9. validate the response and reject invalid output
10. emit analyst-facing artifacts and audit traces
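
Steps 3 and 4 (grouping deterministic rule hits into cases by shared entity and time proximity) can be sketched as follows, assuming hits carry an `entity` and an epoch-seconds `ts`; the names and gap threshold are illustrative:

```python
def group_hits_into_cases(hits, gap_seconds=300):
    """Group rule hits that share an entity and occur within
    `gap_seconds` of the previous hit for that entity."""
    cases = []
    open_cases = {}  # entity -> index of its most recent case
    for hit in sorted(hits, key=lambda h: h["ts"]):
        idx = open_cases.get(hit["entity"])
        if idx is not None and hit["ts"] - cases[idx]["last_ts"] <= gap_seconds:
            cases[idx]["hits"].append(hit)
            cases[idx]["last_ts"] = hit["ts"]
        else:
            # Too far apart in time, or a new entity: open a fresh case.
            open_cases[hit["entity"]] = len(cases)
            cases.append({"entity": hit["entity"], "hits": [hit], "last_ts": hit["ts"]})
    return cases
```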

## Guardrails

- telemetry content is marked as untrusted data
- system instructions are separated from the evidence payload
- the response must pass local JSON schema validation
- the response must pass a semantic validation layer after schema validation
- the `human_verification` field is mandatory and must be set to `required`
- no external tool use is allowed in the LLM stage
- no automated response actions are allowed
- forbidden action-taking or final-verdict language is rejected and recorded
- summaries are rejected if the returned `case_id` does not exactly match the input case bundle
- a prompt-injection-like sample event is included and treated as telemetry, not instruction
- rejected summaries are fail-closed: they do not enter `case_summaries.json`
- accepted and rejected outcomes are both recorded in `audit_traces.jsonl`
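
A fail-closed validation layer along these lines might look like the following sketch; the field names, phrase list, and rejection-reason strings are assumptions for illustration, not the demo's exact configuration:

```python
import json

REQUIRED_FIELDS = ("case_id", "summary", "human_verification")
FORBIDDEN_PHRASES = ("we blocked", "confirmed compromise", "isolate the host")

def validate_summary(raw_response, expected_case_id):
    """Validate an LLM case summary; return (accepted, rejection_reason).
    Any failure rejects the summary rather than repairing it."""
    try:
        doc = json.loads(raw_response)
    except json.JSONDecodeError:
        return False, "malformed_json"
    if any(field not in doc for field in REQUIRED_FIELDS):
        return False, "missing_required_fields"
    if doc["case_id"] != expected_case_id:
        return False, "case_id_mismatch"
    if doc["human_verification"] != "required":
        return False, "semantic_validation_failed"
    text = json.dumps(doc).lower()
    if any(phrase in text for phrase in FORBIDDEN_PHRASES):
        return False, "semantic_validation_failed"
    return True, None
```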

## Quick start

From the repository root:

```bash
python -m pip install -e .
python -m telemetry_window_demo.cli run-ai-demo
```

Generated artifacts are written to `demos/ai-assisted-detection-demo/artifacts/`.

## Demo inputs

- sample data: `data/raw/sample_security_events.jsonl`
- deterministic rules: `config/rules.yaml`
- structured output schema: `config/llm_case_output_schema.json`

## Expected artifacts

- `artifacts/rule_hits.json`
- `artifacts/case_bundles.json`
- `artifacts/case_summaries.json`
- `artifacts/case_report.md`
- `artifacts/audit_traces.jsonl`

The bundled sample data is designed to produce at least three generated cases.

## Artifact semantics

- `rule_hits.json`: deterministic rule hits with rule metadata, ATT&CK mapping, entities, and evidence highlights
- `case_bundles.json`: grouped cases with severity, rule hits, ATT&CK mappings, raw evidence, and untrusted-data marking
- `case_summaries.json`: only accepted JSON summaries that passed schema and semantic validation
- `case_report.md`: analyst-facing report that shows accepted summaries, explicitly notes rejected case summaries, and includes a top-level run integrity section that surfaces rule/config degradation
- `audit_traces.jsonl`: stable per-record audit log for accepted and rejected paths, using `schema_version = ai-assisted-detection-audit/v1` and including `ts`, `case_id`, `validation_status`, `rejection_reason`, `rule_ids`, `prompt_input_digest`, `evidence_digest`, and bounded response excerpts
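
For orientation, a hypothetical accepted-path audit record using the documented fields might serialize like this; every value below is illustrative:

```python
import json

# A made-up accepted-path record; field names follow the documented
# audit schema, but all values here are placeholders.
record = {
    "schema_version": "ai-assisted-detection-audit/v1",
    "ts": "2024-01-01T00:00:00Z",
    "case_id": "CASE-001",
    "validation_status": "accepted",
    "rejection_reason": None,
    "rule_ids": ["R-AUTH-001"],
    "prompt_input_digest": "sha256:<digest>",
    "evidence_digest": "sha256:<digest>",
    "response_excerpt": "<bounded excerpt>",
}
line = json.dumps(record)  # one JSONL line per accepted or rejected outcome
```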

## Rejection behavior

- non-JSON or malformed JSON responses are rejected and recorded
- missing required fields or invalid enum values are rejected and recorded
- schema-valid summaries with the wrong `case_id` are rejected and recorded
- action-taking language is rejected
- final-verdict or confirmed-compromise language is rejected
- malformed rule or ATT&CK metadata is rejected before detection logic uses it

Rejected outputs do not become analyst summaries. Analysts can still inspect deterministic evidence through `case_bundles.json`, `case_report.md`, and `audit_traces.jsonl`.

## Reviewer walkthrough

### Accepted summary path

Use the default sample run artifacts in `artifacts/case_summaries.json`, `artifacts/case_report.md`, and `artifacts/audit_traces.jsonl`.

Verify that `CASE-001` appears in all three places, that the `case_id` matches exactly, that `human_verification` is `required`, and that the audit record shows `validation_status = accepted` with `schema_version = ai-assisted-detection-audit/v1`.

### Rejected summary path

Run `pytest tests/test_ai_assisted_detection_demo.py -k "audit_traces_capture_accepted_and_rejected_paths or case_id_mismatch"` and inspect the `case_report.md`, `case_summaries.json`, and `audit_traces.jsonl` artifacts written by the test.

Verify that the rejected case is absent from `case_summaries.json`, appears in `case_report.md` as `Summary status: rejected`, and has an audit record with `validation_status = rejected` plus a concrete `rejection_reason` such as `missing_required_fields`, `semantic_validation_failed`, or `case_id_mismatch`.

### Degraded coverage path

Run `pytest tests/test_ai_assisted_detection_demo.py -k malformed_attack_metadata_is_rejected_and_recorded` and inspect the generated `case_report.md` and `audit_traces.jsonl`.

Verify that `case_report.md` exposes `## Run Integrity`, `coverage_degraded: yes`, and the rejected rule id, and that `audit_traces.jsonl` contains a global rejection record with `case_id = null` and `rejection_reason = rule_metadata_validation_failed`.

## Limitations

- the LLM stage is a constrained local demo adapter, not a production model integration
- detections are intentionally small and rule-based
- grouping is simple and optimized for readability over recall
- sample telemetry is synthetic and limited in volume
- there is no ticketing, SOAR, sandboxing, or live data ingestion
- artifacts are for analyst review only and do not represent final incident disposition
- rejection logic is intentionally conservative and favors fail-closed behavior over model flexibility
1 change: 1 addition & 0 deletions demos/ai-assisted-detection-demo/artifacts/.gitkeep
@@ -0,0 +1 @@
