perf: buffer accumulation in _write_query_params() reduces f.write() calls #790
Draft
mykaul wants to merge 1 commit into scylladb:master from
Author
Just spitting this here: Honest answer: on its own, ~100ns per call is tiny. But context matters:
Replace the per-parameter write_value(f, param) loop in _QueryMessage._write_query_params() with a buffer accumulation approach: list.append + b"".join + single f.write(). This reduces the number of f.write() calls from 2*N+1 to 1, which is significant for vector workloads with large parameters.
Also removes the redundant ExecuteMessage._write_query_params() pass-through override to avoid extra MRO lookup per call.
Includes 14 unit tests covering normal, NULL, UNSET, empty, large vector, and mixed parameter scenarios for both ExecuteMessage and QueryMessage.
Includes a benchmark script (benchmarks/bench_execute_write_params.py).
Summary
Replace per-parameter write_value(f, param) loops with buffer accumulation (list.append + b"".join + single f.write()), reducing f.write() calls from (2*N + 1) to 1 for N query parameters in the execute/query path.
This supersedes the closed PR #788 (inlining approach). Buffer accumulation is strictly superior: it achieves equal or better speedups in every scenario while producing a smaller, cleaner diff.
Motivation
Every CQL query/execute call serializes query parameters via write_value(f, param), which makes two f.write() calls per parameter (length prefix + data). For queries with vector embeddings (128–1536 dimensions), this creates many small writes per message.
Buffer accumulation collects all bytes in a Python list and writes once, eliminating per-parameter function call overhead and reducing syscall-like overhead.
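The write-count difference described above can be illustrated with a minimal sketch. It is not the driver's code: CountingWriter, write_params_per_value, and write_params_buffered are hypothetical names, the 2-byte parameter count is an assumption about the wire layout, and only non-null values are handled.

```python
import io
import struct

int32_pack = struct.Struct(">i").pack    # 4-byte big-endian length prefix
uint16_pack = struct.Struct(">H").pack   # 2-byte parameter count (assumed layout)


class CountingWriter(io.BytesIO):
    """BytesIO that counts write() calls (hypothetical helper for illustration)."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


def write_params_per_value(f, params):
    """Per-parameter style: one write for the count, then two per value."""
    f.write(uint16_pack(len(params)))
    for param in params:
        f.write(int32_pack(len(param)))  # length prefix
        f.write(param)                   # data


def write_params_buffered(f, params):
    """Buffer accumulation: collect every piece in a list, join, write once."""
    parts = [uint16_pack(len(params))]
    for param in params:
        parts.append(int32_pack(len(param)))
        parts.append(param)
    f.write(b"".join(parts))


params = [b"\x00" * 16] * 4              # N = 4 non-null parameters
per_value, buffered = CountingWriter(), CountingWriter()
write_params_per_value(per_value, params)
write_params_buffered(buffered, params)
print(per_value.write_calls, buffered.write_calls)
```

With N = 4, the per-value writer issues 2*N + 1 writes and the buffered writer issues one, while both produce the same bytes.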
What changed
cassandra/protocol.py (2 hunks)
- _QueryMessage._write_query_params(): Buffer accumulation for the parameter loop. Local variable caching (_int32_pack, _parts_append) for Cython-friendly tight loop.
- ExecuteMessage._write_query_params(): Removed unnecessary super() pass-through override (now inherited directly from _QueryMessage).
tests/unit/test_protocol.py
Added 14 new test methods in WriteQueryParamsBufferAccumulationTest:
- encode_message round-trip through ProtocolHandler
benchmarks/bench_execute_write_params.py (new)
Standalone benchmark script for reproducibility.
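As a rough illustration of the parameter loop with local variable caching, here is a sketch rather than the actual diff. The function name write_query_params_sketch and the UNSET sentinel are hypothetical stand-ins, and it assumes the CQL v4 convention that a length of -1 marks NULL and -2 marks an unset value.

```python
import io
import struct

_int32_pack = struct.Struct(">i").pack
_uint16_pack = struct.Struct(">H").pack

UNSET = object()  # hypothetical stand-in for the driver's unset-value sentinel


def write_query_params_sketch(f, params):
    """Serialize a parameter list with a single f.write() (illustrative only)."""
    parts = [_uint16_pack(len(params))]  # assumed: 2-byte parameter count
    _parts_append = parts.append         # cache the bound method outside the loop
    int32_pack = _int32_pack             # local alias avoids a global lookup per use
    for param in params:
        if param is None:
            _parts_append(int32_pack(-1))   # NULL: length -1, no data
        elif param is UNSET:
            _parts_append(int32_pack(-2))   # UNSET: length -2, no data
        else:
            _parts_append(int32_pack(len(param)))
            _parts_append(param)
    f.write(b"".join(parts))


buf = io.BytesIO()
write_query_params_sketch(buf, [b"ab", None, UNSET, b""])
print(buf.getvalue().hex())
```

An empty value (b"") still gets a zero length prefix, which keeps it distinct from NULL on the wire.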
Benchmark results
Environment: Python 3.14, Cython .so compiled, 500K iterations, best of 5 runs.
Comparison with PR #788 (inlining)
Implementation notes
- list.append + b"".join benchmarked faster than bytearray +=
- protocol.py is Cython-compiled; optimization benefits both pure Python and Cython paths
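The list.append + b"".join versus bytearray += comparison can be tried locally with a small timeit sketch. This is not the PR's benchmark script: the function names, the parameter shape (8 values of 128 float32-sized elements), and the iteration counts are all assumptions, and results will vary by interpreter and machine.

```python
import struct
import timeit

int32_pack = struct.Struct(">i").pack


def build_with_join(params):
    """Accumulate pieces in a list and join them once at the end."""
    parts = []
    append = parts.append
    for p in params:
        append(int32_pack(len(p)))
        append(p)
    return b"".join(parts)


def build_with_bytearray(params):
    """Grow a bytearray in place with +=, then convert to bytes."""
    buf = bytearray()
    for p in params:
        buf += int32_pack(len(p))
        buf += p
    return bytes(buf)


def compare(params, number=2000, repeat=5):
    """Return the best-of-`repeat` time in seconds for each strategy."""
    return {
        "join": min(timeit.repeat(lambda: build_with_join(params),
                                  repeat=repeat, number=number)),
        "bytearray": min(timeit.repeat(lambda: build_with_bytearray(params),
                                       repeat=repeat, number=number)),
    }


# Hypothetical workload: 8 parameters, each the size of 128 float32 values.
params = [b"\x00" * (4 * 128)] * 8
timings = compare(params)
print(timings)
```

Both builders return identical bytes, so the timing difference reflects only the accumulation strategy.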