
Rewrite bk job log with Parquet-backed reads, follow mode, and URL input#720

Open
mekenthompson wants to merge 4 commits into buildkite:main from mekenthompson:ken/rewrite-job-log

Conversation

@mekenthompson

Summary

Rewrites bk job log (aliased as bk logs) on top of the buildkite-logs Parquet library, bringing the MCP server's log capabilities to the CLI. This matters now because LLM-based tools increasingly reach for CLI commands when an MCP server isn't explicitly configured -- and bk logs is about to become a dependency for official Buildkite agentic skills shipping shortly.

The command is modeled after kubectl logs, docker logs, fly logs, and railway logs while handling Buildkite-specific realities: step keys, parallel job matrices, grouped log sections, and the copy-paste-a-URL-from-Slack workflow that CI debugging actually starts with.

Smart defaults mean zero flags for the common case:

  • Pipeline and build resolve automatically from the git repo and current branch
  • Single-job builds auto-select the job; multi-job builds show a picker
  • Running jobs auto-follow in a TTY (with a stderr notice); finished jobs dump through a pager
  • Color and pager disabled automatically when piped
  • Spinner and interactive prompts suppressed in non-TTY

Flags are opt-in for power use cases: --tail N, --follow, --since/--until, --seek/--limit, --step, --group, --json, --timestamps. Designed to compose with standard Unix tools:

bk logs -f | grep -i "error\|panic"           # live search
bk logs --json -n 100 | jq '.content'          # structured extraction
bk logs --since 5m | tail -20                   # recent output
bk logs <slack-url> -n 50                       # paste and go

Job to be done

A developer's build just failed. They got a Slack notification with a Buildkite URL. They want to see what went wrong without leaving their terminal, without copy-pasting UUIDs, and without downloading a 10MB log just to look at the last 20 lines.

What changed

Buildkite URL as input -- Copy a URL from Slack or the web UI, paste it as the argument. bk logs https://buildkite.com/org/pipe/builds/123#job-id extracts everything. Build-only URLs open the job picker. Slack's <angle-bracket> wrapping is stripped automatically.

Follow mode -- bk logs -f polls every 2s, streams new lines as they appear, and exits when the job reaches a terminal state. When you run bk logs with no flags on a running job in a TTY, it auto-follows and tells you on stderr.

Tail -- bk logs -n 50 shows the last 50 lines without downloading the full log. Combines with --follow (show last N then stream) and --since (last N lines within a time window).

Time filtering -- --since 5m and --until <RFC3339> filter by timestamp. Works across all modes.

Parallel step disambiguation -- --step test on a build with parallelism: 5 now shows a picker with parallel indices instead of silently returning the first match.

JSON output -- --json emits JSONL. Old --yaml/--text/-o flags removed (they were inherited from OutputFlags and silently ignored).

Typed errors -- Flag conflicts exit 2 with "Validation Error:". Missing jobs/builds exit 4 with "Not Found:" and suggestions. API failures exit 3 with status-code-specific messages.

Bug fix -- --follow --tail N on a job with zero log output crashed on SeekToRow(0) against an empty Parquet file. Fixed with a row count guard.

Use cases tested against live Buildkite builds

  • Paste a full job URL from the web UI
  • Paste a build-only URL, pick job interactively
  • Paste a Slack-wrapped <URL>
  • --step build on a multi-step pipeline
  • --step nonexistent (exit 4, actionable error)
  • -n 5 on a finished job
  • -n 3 -f on a running job (tail then stream)
  • -f on a finished job (dump log, exit in <2s)
  • -f on a running job (stream lines every 2s, exit when done)
  • --json | jq '.content'
  • --json --since <timestamp> | jq -r '@tsv'
  • --since 1h on a build from days ago (empty, exit 0)
  • --since <mid-build-timestamp> -n 3
  • --seek 100 --limit 5
  • --timestamps (RFC3339 prefix)
  • Pipe to grep (no pager, no color)
  • -n 1000 when log has 37 lines (shows all 37)
  • Wrong pipeline (exit 3, "404 No pipeline found")
  • Wrong build number (exit 3, "404 Not Found")
  • Nonexistent job UUID in URL (exit 3, "job not found")
  • --yaml flag (rejected, suggests --tail)
  • URL + --pipeline (exit 2, "cannot use --pipeline with a URL")

Edge cases handled

  • Empty job (0 log rows): "No log output for this job." exit 0
  • Follow + tail on empty job: no crash, polls until output appears
  • Uppercase UUIDs in URL fragment
  • URLs with query params (rejected, won't false-match)
  • URLs with trailing slashes or extra path segments (rejected)
  • Empty URL fragments (rejected)
  • Double-pasted URLs (rejected)
  • Markdown-wrapped URLs (rejected)
  • --timestamps with --json (JSON always includes timestamps, flag is a no-op but doesn't error)
  • Multiple bk;t= markers in a single line (all stripped)
  • Follow mode: tolerates up to 10 consecutive API errors before giving up
  • Follow mode: Ctrl-C exits cleanly (exit 0, no error)
  • --no-input with multiple jobs and no job ID: clear error instead of hanging on a prompt

Test plan

  • go test ./cmd/job/ -- 99 tests, all passing
  • go test ./... -- full suite green
  • go build . -- compiles, --help output correct
  • mise run format -- clean
  • mise run lint -- 0 issues
  • Live tested against competitor-intelligence/starter-pipeline and competitor-intelligence/competitor-intelligence-report builds
  • Triggered builds with slow output (30 lines over 60s) to verify follow mode streams in real time
  • Verified pager skipped when piped, auto-follow skipped for finished jobs

🤖 Generated with Claude Code

mekenthompson and others added 2 commits March 25, 2026 23:27
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… and typed errors

The old `bk job log` fetched the entire log via REST and dumped it through
a pager. Fine for small jobs, useless for a 50,000-line test suite failure
at 2am. This rewrites the command on top of the buildkite-logs library
(same backend as the MCP server), which downloads logs once, converts to
Parquet, and caches locally for fast columnar reads.

This brings feature parity between the CLI and the MCP server for log
access -- increasingly important as LLM-based tools bias toward CLI
commands when MCP isn't explicitly configured. This will also be a
dependency for official Buildkite agentic skills shipping shortly.

What changed:

- Read/tail/follow modes: full log with pager, --tail N for last N lines,
  --follow polls every 2s for running jobs and exits when the job finishes.
  Auto-follow when TTY + running job + no explicit flags.

- Buildkite URL input: paste a URL from the web UI or Slack and it extracts
  org/pipeline/build/job. Handles <angle-bracket> Slack wrapping.
  Build-only URLs (no #fragment) fall through to the job picker.

- Step key resolution with parallel matrix support: --step test picks
  the job by pipeline.yml key. When multiple parallel jobs match the same
  key, shows the interactive picker instead of silently returning the first.

- Time filtering: --since 5m, --until 2026-01-15T10:00:00Z, or both.
  Works with tail, read, and follow modes. Duration values pin to
  invocation time so filtering is deterministic across the log.

- JSON output: --json emits one JSON object per line (JSONL) with
  row_number, timestamp, content, and group. Replaces the old OutputFlags
  embed that exposed --yaml/--text/--output flags which silently did nothing.

- Typed errors: all user-facing errors now use the CLI's error type system.
  Flag conflicts exit 2 (validation), missing resources exit 4 (not found),
  API failures exit 3 with status-code-specific messages and suggestions.

- Group filtering: --group "Running tests" shows only log lines within
  a Buildkite --- group section.

- Pager integration: full-log reads go through less -R (respects PAGER env,
  --no-pager, and config). Tail, follow, and JSON skip the pager. Non-TTY
  disables pager, color, auto-follow, and the spinner.

Bug fix: follow mode with --tail on a job with 0 log rows crashed because
SeekToRow(0) failed on an empty Parquet file. Added a row count guard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@mekenthompson mekenthompson requested review from a team as code owners March 25, 2026 12:28
Comment on lines +69 to +72
$ bk logs https://buildkite.com/my-org/my-pipeline/builds/123#0190046e-e199-453b-a302-a21a4d649d31

# Build URL without job fragment (opens job picker)
$ bk logs https://buildkite.com/my-org/my-pipeline/builds/123
Contributor

I assume bk job log is preferred?

Author

Good catch. I want bk logs to be the first-class command, same as kubectl logs / docker logs / fly logs. bk job log stays for compatibility but bk logs is what we promote. Switching all help examples to use bk logs.

}
}

func (c *LogCmd) validateFlags() error {
Contributor

Should this function check that --group and --seek have not been used together?

if c.Seek >= 0 && c.Group != "" {...}

Author

You're right, --seek silently wins and --group gets dropped. Adding a validation error. We could compose them but it's not clear what "seek within a group" means, and nobody's asked for it.

Comment on lines +736 to +738
Content: "hello",
Timestamp: 1000,
RowNumber: 0,
Contributor

This is a test to strip out ANSI but the content contains no ANSI

Author

Yep, test passes trivially with no ANSI in the input. Updating to include actual escape codes so CleanContent(true) is exercised.

cmd/job/log.go Outdated
Content: strings.TrimRight(entry.CleanContent(true), "\n"),
Group: entry.Group,
}
data, _ := json.Marshal(obj)
Contributor

Should we do something with the error here? Maybe in debug mode at least?

Author

Can't actually fail with these types (string + int64), but swallowing the error reads wrong. Adding an early return with a stderr warning.

func TestBuildJobLabelsParallelIndex(t *testing.T) {
t.Parallel()

idx0, idx1, idx2 := 0, 1, 2
Contributor

What do these do as they're ignored later?

Author

Dead code, leftover from an earlier approach. Deleted.

mekenthompson and others added 2 commits March 28, 2026 09:36
…x tests

- Use `bk logs` consistently in help examples (first-class command,
  `bk job log` kept for compatibility)
- Add --seek/--group mutual exclusivity check to validateFlags()
- Fix ANSI strip test to include actual escape codes in input
- Handle json.Marshal error with stderr warning instead of swallowing
- Remove unused idx0/idx1/idx2 variables from parallel index test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
go.sum resolved via go mod tidy. Re-exported isTTY as IsTTY in
internal/io/pager.go since it was unexported by an upstream change
but is needed by cmd/job/log.go for auto-follow TTY detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@mipearson mipearson left a comment


Comments courtesy the code-review skill in amp (except the one about the PRD), cross-referenced against opus 4.6 & gpt 5.4 to make sure, and de-duped against Ben's findings.

robots on robots on robots.


startRow := max(fileInfo.RowCount-int64(c.Tail), 0)

for entry, iterErr := range reader.SeekToRow(startRow) {

Bug: --tail without time filters ignores --group. This path uses SeekToRow(startRow) which reads raw rows with no group filtering. The time-filter branch above correctly uses FilterByGroupIter (line 562), but this branch doesn't.
bk logs --tail 20 --group "Running tests" will return the last 20 lines of the entire log, not the last 20 lines of the "Running tests" group.

lastSeenRow = fileInfo.RowCount
} else {
// Show everything from the beginning (respecting --since if set)
for entry, iterErr := range reader.ReadEntriesIter() {

Bug: --group filter is not applied in follow mode. Both the initial fetch (lines 613 and 625 use SeekToRow/ReadEntriesIter directly) and the polling loop (line 679 uses SeekToRow) emit all entries regardless of group.
bk logs -f --group "tests" will print all log output, not just entries from the "tests" group.

reqCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()

buildInfo, _, err := f.RestAPIClient.Builds.Get(reqCtx, org, pipeline, build, nil)

Nit: jobState calls Builds.Get which fetches the entire build including all jobs. In follow mode this runs every 2 seconds (line 693). For builds with high parallelism this is a lot of payload to fetch repeatedly just to check one job's state. Not blocking, but worth noting - if go-buildkite ever adds a single-job endpoint, this would be a good candidate.

@@ -0,0 +1,301 @@
# PRD: Enhanced `bk job log` Command

Should this remain in this repository, and if so, where should it live? Probably not the root directory - docs/prds maybe?

