Implement streaming response optimization for non-Next.js publisher proxy #563

@aram356

Description

Context

The publisher proxy currently buffers the entire response body in memory before sending any bytes to the client. For a 222KB HTML page, peak memory is ~4x the response size and no bytes reach the client until all processing completes.
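The buffered-vs-streaming contrast above can be pictured with a small sketch (illustrative only; `process` is a stand-in for the proxy's HTML rewriting, and all names are invented):

```python
def process(chunk: bytes) -> bytes:
    # Stand-in for the proxy's per-chunk HTML rewriting (illustrative only).
    return chunk.replace(b"http://", b"https://")

def buffered(body_chunks) -> bytes:
    # Current behaviour: accumulate the full body, then process it in one
    # pass -- peak memory is a multiple of the response size, and no bytes
    # leave until everything is done.
    whole = b"".join(body_chunks)
    return process(whole)

def streaming(body_chunks):
    # Target behaviour: process and emit each chunk as it arrives -- peak
    # memory is roughly one chunk plus parser state.
    for chunk in body_chunks:
        yield process(chunk)
```

For inputs where no rewrite pattern straddles a chunk boundary, both paths produce identical output; handling split patterns is exactly what Phase 3 addresses.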

Spec

See streaming response design spec (PR #562).

Plan

See implementation plan (PR #562).

Phase 1: Make streaming pipeline chunk-emitting (PR #583)

Ships independently with immediate memory savings.
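One way to picture a chunk-emitting pipeline is as a lazy chain of stages, each consuming chunks from the previous one (a sketch under assumed names, not the actual implementation):

```python
from typing import Callable, Iterable, Iterator, List

Chunk = bytes
Stage = Callable[[Iterable[Chunk]], Iterator[Chunk]]

def run_pipeline(chunks: Iterable[Chunk], stages: List[Stage]) -> Iterator[Chunk]:
    # Chain the stages lazily: each stage pulls chunks from the previous one
    # and yields processed chunks downstream, so no stage ever forces the
    # whole body into memory.
    stream: Iterable[Chunk] = chunks
    for stage in stages:
        stream = stage(stream)
    yield from stream

def lowercase_stage(chunks: Iterable[Chunk]) -> Iterator[Chunk]:
    # Invented example stage; the real stages would be the HTML rewriters.
    for chunk in chunks:
        yield chunk.lower()
```

Because every stage is a generator, a chunk flows through the whole chain before the next chunk is even read from the backend.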

Phase 2: Stream responses to client via StreamingBody (PR #585)

Depends on Phase 1. Improves TTFB and TTLB (time to first/last byte).
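The TTFB effect can be demonstrated with a tiny simulation (hypothetical; the relay generator here only models what the real `StreamingBody` does, it is not the real type):

```python
from typing import Iterator, List

def upstream(log: List[str]) -> Iterator[bytes]:
    # Simulated backend: record when each chunk is produced.
    for i in range(3):
        log.append(f"produced {i}")
        yield f"chunk{i} ".encode()

def streaming_body(body: Iterator[bytes], log: List[str]) -> Iterator[bytes]:
    # Relay each processed chunk to the client as soon as it is ready,
    # instead of waiting for the full body -- this is what improves TTFB.
    for chunk in body:
        log.append("sent")
        yield chunk
```

Consuming the relay shows "sent" immediately after "produced 0": the client receives its first bytes before the backend has finished producing the body.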

Phase 3: Make script rewriters fragment-safe

Depends on Phase 2. Removes the buffered fallback, enabling full streaming even with the GTM/Next.js script rewriters active.
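A fragment-safe rewriter must handle patterns that straddle chunk boundaries. A minimal sketch of the carry-over technique (illustrative; it assumes the replacement never ends with a prefix of the pattern, a case real rewriters must handle more carefully):

```python
from typing import Iterable, Iterator

def fragment_safe_replace(chunks: Iterable[bytes], old: bytes,
                          new: bytes) -> Iterator[bytes]:
    # Hold back the last len(old) - 1 bytes of each processed piece: they
    # may be the start of a match that completes in the next chunk.
    keep = len(old) - 1
    carry = b""
    for chunk in chunks:
        data = (carry + chunk).replace(old, new)
        if keep == 0:
            emit, carry = data, b""
        elif len(data) > keep:
            emit, carry = data[:-keep], data[-keep:]
        else:
            emit, carry = b"", data
        if emit:
            yield emit
    if carry:
        yield carry
```

The same idea applies to the GTM/Next.js script rewriters: each keeps just enough tail state between chunks to recognize a construct split across a boundary, instead of requiring the whole document.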

Acceptance Criteria

  • Streaming activates when Next.js is disabled and backend returns 2xx
  • Peak memory per request reduced from ~4x to constant (chunk buffer + parser state)
  • Client receives first body bytes after first processed chunk, not after full buffering
  • No regressions on static, auction, or discovery endpoints
  • Buffered fallback works correctly when post-processors are registered
  • (Phase 3) Streaming works even with GTM/Next.js script rewriters active
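The activation and fallback criteria above could be gated roughly as follows (hypothetical names and signatures; the real conditions live in the proxy's routing code):

```python
from typing import Callable, Iterable, Iterator, List

PostProcessor = Callable[[bytes], bytes]

def should_stream(nextjs_enabled: bool, status: int,
                  post_processors: List[PostProcessor]) -> bool:
    # Stream only when Next.js is disabled, the backend returned 2xx, and
    # no whole-document post-processors are registered.
    return not nextjs_enabled and 200 <= status < 300 and not post_processors

def respond(chunks: Iterable[bytes], nextjs_enabled: bool, status: int,
            post_processors: List[PostProcessor]) -> Iterator[bytes]:
    if should_stream(nextjs_enabled, status, post_processors):
        yield from chunks                # streaming path
    else:
        whole = b"".join(chunks)         # buffered fallback
        for post in post_processors:
            whole = post(whole)
        yield whole
```

Registering any post-processor silently selects the buffered path, which is the behaviour the fallback criterion is testing for.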
