Skip to content

Phase 1: Make streaming pipeline chunk-emitting#583

Open
aram356 wants to merge 21 commits intospecs/streaming-response-optimizationfrom
feature/streaming-pipeline-phase1
Open

Phase 1: Make streaming pipeline chunk-emitting#583
aram356 wants to merge 21 commits intospecs/streaming-response-optimizationfrom
feature/streaming-pipeline-phase1

Conversation

@aram356
Copy link
Copy Markdown
Collaborator

@aram356 aram356 commented Mar 26, 2026

Summary

Make the internal streaming pipeline truly chunk-emitting so processors emit output incrementally instead of buffering entire documents. This is the foundation for Phase 2 (streaming responses to the client via StreamingBody) but ships independently with immediate memory and correctness benefits.

Closes #568, closes #569, closes #570, closes #571, closes #572.
Part of epic #563.

What changed

Encoder finalization fix (process_through_compression):

  • Changed signature from W (by value) to &mut W so callers retain ownership
  • Replaced flush() + drop(encoder) with explicit encoder.finish() / encoder.into_inner()
  • Finalization errors now propagate instead of being silently swallowed
  • Fixes pre-existing correctness issue for deflate/brotli, prevents regression when gzip moves to this path

All compression paths now chunk-based:

  • process_gzip_to_gzip: replaced read_to_end() with process_through_compression pattern
  • decompress_and_process (used by *_to_none paths): replaced read_to_end() with chunk read loop
  • All 6 compression paths now follow the same structure: read chunk → process → write → repeat

Unified process_chunks method:

  • Extracted shared chunk read/process/write loop from process_uncompressed, process_through_compression, and decompress_and_process into a single process_chunks method
  • Reduces 3 near-identical implementations to 1

HtmlRewriterAdapter incremental streaming:

  • lol_html rewriter created eagerly in constructor with Rc<RefCell<Vec<u8>>> shared output sink
  • process_chunk() now emits output on every call, not just is_last
  • Adapter is single-use (reset() is a no-op — Settings consumed by constructor)

HtmlWithPostProcessing adaptation:

  • Added output accumulation buffer for the is_last chunk when post-processors are registered
  • Intermediate chunks pass through directly; final chunk triggers post-processing on accumulated output
  • When no post-processors are registered (the streaming path), it's a pure passthrough

Files changed

File Lines What
streaming_processor.rs +380 -329 All pipeline changes above
html_processor.rs +29 -1 Post-processor accumulation for is_last

Verification

  • cargo test --workspace — 748 passed, 0 failed
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • npx vitest run — 282 passed
  • cargo build --release --target wasm32-wasip1 — success

Test plan

  • Existing streaming_processor tests updated for incremental output
  • New deflate round-trip test
  • New gzip round-trip test (chunk-based)
  • New gzip-to-none test
  • HtmlRewriterAdapter per-chunk emission test
  • Full pipeline integration test with HTML rewriting
  • WASM build succeeds

aram356 added 17 commits March 26, 2026 09:08
…ation test

- Add debug-level logging to process_chunks showing total bytes
  read and written per pipeline invocation
- Add brotli-to-brotli round-trip test to cover the into_inner()
  finalization path
- Add test proving HtmlWithPostProcessing accumulates output when
  post-processors are registered while streaming path passes through
- Group std imports together (cell, io, rc) before external crates
- Document supported compression combinations on PipelineConfig
- Remove dead-weight byte counters from process_chunks hot loop
- Fix stale comment referencing removed process_through_compression
- Fix brotli finalization: use drop(encoder) instead of into_inner()
  to make the intent clear (CompressorWriter writes trailer on drop)
- Document reset() as no-op on HtmlRewriterAdapter (single-use)
- Add brotli round-trip test covering into_inner finalization path
- Add gzip HTML rewriter pipeline test (compressed round-trip with
  lol_html, not just StreamingReplacer)
- Add HtmlWithPostProcessing accumulation vs streaming behavior test
- Add Eq derive to Compression enum (all unit variants, trivially correct)
- Brotli finalization: use into_inner() instead of drop() to skip
  redundant flush and make finalization explicit
- Document process_chunks flush semantics: callers must still call
  encoder-specific finalization after this method returns
- Warn when HtmlRewriterAdapter receives data after finalization
  (rewriter already consumed, data would be silently lost)
- Make HtmlWithPostProcessing::reset() a true no-op matching its doc
  (clearing auxiliary state without resetting rewriter is inconsistent)
- Document extra copying overhead on post-processor path vs streaming
- Assert output content in reset-then-finalize test (was discarded)
- Relax per-chunk emission test to not depend on lol_html internal
  buffering behavior — assert concatenated correctness + at least one
  intermediate chunk emitted
- Add test that feeds multiple chunks through HtmlWithPostProcessing
  with an active post-processor (should_process returns true, mutates
  HTML). Verifies the post-processor receives the complete accumulated
  document and its mutations appear in the output.
- Make flush semantics per-codec explicit in process_chunks doc:
  flate2 needs finish() after flush, brotli is finalized by flush
@aram356 aram356 self-assigned this Mar 26, 2026
aram356 added 4 commits March 26, 2026 11:46
lol_html fragments text nodes across chunk boundaries when fed
incrementally. This breaks script rewriters (NextJS __NEXT_DATA__,
GTM) that expect complete text content — a split domain like
"google" + "tagmanager.com" would silently miss the rewrite.

Add dual-mode HtmlRewriterAdapter:
- new(): streaming mode, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, feeds lol_html in one write()
  call on is_last (script rewriters registered)

create_html_processor selects the mode based on whether
script_rewriters is non-empty. This preserves the old behavior
(single-pass processing) when rewriters need it, while enabling
streaming when they don't.

Add regression test proving lol_html does fragment text across
chunk boundaries, confirming the issue is real.
lol_html fragments text nodes across input chunk boundaries. Script
rewriters (NextJS __NEXT_DATA__, GTM) expect complete text content
and would silently miss rewrites on split fragments.

Add dual-mode HtmlRewriterAdapter:
- new(): streaming, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, single write() on is_last

create_html_processor selects mode based on script_rewriters. This
preserves correctness while enabling streaming for configs without
script rewriters. Phase 3 will make rewriters fragment-safe.

Add regression test proving lol_html does fragment text nodes.
Add section to spec explaining the lol_html text fragmentation issue,
the dual-mode HtmlRewriterAdapter workaround (Phase 1), and the
planned fix to make script rewriters fragment-safe (Phase 3, #584).
- Add post-finalization warning to buffered path (was only in streaming)
- Add buffered_adapter_prevents_text_fragmentation test proving
  new_buffered() delivers complete text to lol_html handlers
- Fix spec: html_processor.rs is changed (selects adapter mode), and
  script_rewriters do require buffered mode (not "unaffected")
@aram356 aram356 marked this pull request as ready for review March 27, 2026 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant