
[Blog] Fluss Rust SDK introduction blog #2934

Draft
fresh-borzoni wants to merge 1 commit into apache:main from fresh-borzoni:fluss-rust-blog

Conversation


fresh-borzoni (Contributor) commented Mar 25, 2026

Fluss Rust SDK blog

fresh-borzoni marked this pull request as draft March 25, 2026 11:03

fresh-borzoni (Contributor, Author) commented Mar 25, 2026

cc @luoyuxia @leekeiabstraction

Please take a look at the overall structure; diagrams and visuals are pending until we like the framing and content.


leekeiabstraction (Contributor) left a comment


Thank you for the great post! I've added comments.


When you write a record, the call is synchronous: the record gets queued into a per-bucket batch without touching the network. A background sender task picks up ready batches and ships them as RPCs to the responsible TabletServers. This follows the same pattern as both the Fluss Java client and Kafka producers.
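The enqueue-then-ship pattern above can be sketched with plain std primitives. Everything here (the `Record` type, the bucket count, the `batch_by_bucket` helper) is a hypothetical stand-in for illustration, not the fluss-rs API:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical record type, illustration only.
#[derive(Debug)]
struct Record {
    bucket: u32,
    payload: String,
}

// Group queued records into per-bucket batches; in the real client each
// ready batch would become one RPC to the TabletServer owning that bucket.
fn batch_by_bucket(records: impl IntoIterator<Item = Record>) -> HashMap<u32, Vec<String>> {
    let mut batches: HashMap<u32, Vec<String>> = HashMap::new();
    for r in records {
        batches.entry(r.bucket).or_default().push(r.payload);
    }
    batches
}

fn main() {
    let (tx, rx) = mpsc::channel::<Record>();

    // Background "sender" task: drains the queue and batches per bucket.
    let sender = thread::spawn(move || batch_by_bucket(rx));

    // The write call itself is just an enqueue: no network I/O here.
    for i in 0..6 {
        tx.send(Record { bucket: i % 2, payload: format!("r{i}") }).unwrap();
    }
    drop(tx); // closing the channel lets the sender finish

    let batches = sender.join().unwrap();
    assert_eq!(batches[&0], vec!["r0", "r2", "r4"]);
    assert_eq!(batches[&1], vec!["r1", "r3", "r5"]);
    println!("batches: {batches:?}");
}
```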

The caller gets back a `WriteResultFuture`. Await it to block until the server confirms, or drop it for fire-and-forget. Either way, the server acknowledges the write with acks=all by default, so dropping the future skips the client-side wait, not the durability guarantee.
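The await-or-drop semantics can be modeled in a blocking sketch, using a channel as a stand-in for `WriteResultFuture` (the `WriteResult` and `write` names here are invented for illustration; the real future is async):

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for WriteResultFuture: holds the receiving end of an ack channel.
struct WriteResult(mpsc::Receiver<Result<(), String>>);

impl WriteResult {
    // Blocking analogue of awaiting the future.
    fn wait(self) -> Result<(), String> {
        self.0.recv().unwrap_or(Err("client shut down".into()))
    }
}

// A fake "write" whose ack arrives from a background task, the way the
// real sender task acks once the server confirms the batch.
fn write(payload: &str) -> WriteResult {
    let (tx, rx) = mpsc::channel();
    let payload = payload.to_string();
    thread::spawn(move || {
        // The server-side ack happens whether or not anyone is waiting;
        // send() simply errors if the caller dropped the future.
        let _ = tx.send(Ok(()));
        drop(payload);
    });
    WriteResult(rx)
}

fn main() {
    // Await the ack:
    assert_eq!(write("a").wait(), Ok(()));
    // Fire-and-forget: dropping the result skips only the client-side wait.
    let _ = write("b");
}
```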

not the durability guarantee.

Arguably, e2e durability is impacted if writes keep failing and being retried due to a transient error and the client side then gets restarted. I can imagine that users who care about high e2e durability might want to handle future failures, e.g. by writing into a local dead letter queue. I'm pretty sure advanced users would understand the implication, but it's maybe not something we want to risk being misconstrued.


Batches ship automatically when they fill up or after a short timeout (100ms by default), so `flush()` isn't needed for data to reach the server. It's there for when you need to confirm that everything in flight has landed. If the write buffer fills up, new writes block until space frees up rather than silently consuming unbounded memory.
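The bounded-buffer backpressure can be illustrated with a `sync_channel`, whose `send` blocks once the buffer is full. The capacity and timing below are arbitrary stand-ins, not the client's actual defaults:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // A bounded write buffer: capacity 2 stands in for the client's
    // bounded accumulator.
    let (tx, rx) = mpsc::sync_channel::<String>(2);

    // A deliberately slow "sender task" draining the buffer.
    let drain = thread::spawn(move || {
        let mut sent = 0;
        for _record in rx {
            thread::sleep(Duration::from_millis(10));
            sent += 1;
        }
        sent
    });

    // Once the buffer is full, send() blocks until the drain frees a
    // slot, instead of letting memory grow without bound.
    for i in 0..5 {
        tx.send(format!("r{i}")).unwrap();
    }
    drop(tx); // closing the channel ends the drain loop

    assert_eq!(drain.join().unwrap(), 5);
}
```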

Fluss has two table types (primary key tables and log tables), and the Rust core has a writer for each: `UpsertWriter` for keyed upserts and deletes, `AppendWriter` for append-only log writes. Both support idempotent delivery, and `AppendWriter` can also accept Arrow `RecordBatch` directly if you already have columnar data.
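A toy model of the two table semantics, assuming nothing about the real writer APIs: a primary key table resolves writes by key, while a log table treats every append as a new record. The types and methods here are invented for illustration:

```rust
use std::collections::HashMap;

// Toy stand-ins for the two table types (not the real fluss-rs types).
struct PrimaryKeyTable(HashMap<String, i64>); // upserts/deletes resolve by key
struct LogTable(Vec<i64>);                    // append-only

impl PrimaryKeyTable {
    fn upsert(&mut self, key: &str, value: i64) {
        self.0.insert(key.into(), value);
    }
    fn delete(&mut self, key: &str) {
        self.0.remove(key);
    }
}

impl LogTable {
    fn append(&mut self, value: i64) {
        self.0.push(value);
    }
}

fn main() {
    let mut pk = PrimaryKeyTable(HashMap::new());
    pk.upsert("user1", 1);
    pk.upsert("user1", 2); // same key: overwrites, the table keeps one row
    pk.delete("user1");
    assert!(pk.0.is_empty());

    let mut log = LogTable(Vec::new());
    log.append(1);
    log.append(1); // duplicates are fine: every append is a new record
    assert_eq!(log.0.len(), 2);
}
```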

maybe mention partial updates as well?

Comment on lines +28 to +34
We built fluss-rust on this same idea. A single Rust core implements the full Fluss client protocol (Protobuf-based RPC, record batching with backpressure, background I/O, Arrow serialization, idempotent writes, SASL authentication) and exposes it to three languages:

- **Rust**: directly, as the `fluss-rs` crate
- **Python**: via [PyO3](https://pyo3.rs), the Rust-Python bridge
- **C++**: via [CXX](https://cxx.rs), the Rust-C++ bridge

To give a sense of proportion: the Rust core is roughly 40k lines, while the Python binding is around 5k and the C++ binding around 6k. The bindings handle type conversion, async runtime bridging, and memory ownership at the language boundary, but all the protocol logic, batching, Arrow codec, and retry handling live in the shared core.

I wonder if we should include a diagram on this. The section reads well, adding diagram reinforces the message (and captures attention!).

IMO maybe the diagram can have fluss + rust mascots and replace the banner? More informative and also reads / shares well on sites like LinkedIn.


The first is **DataFusion integration**. The Rust core already produces Arrow RecordBatches, which is exactly what DataFusion's table provider interface expects. Wiring the two together would let users run SQL queries directly over Fluss data from Rust or Python, without going through Flink.

The second is a **Fluss gateway service** built on top of the Rust core. Not every environment can load a native library. A lightweight Rust-based gateway could expose Fluss over HTTP or gRPC, making it accessible from any language or tool that can make a network call. The Rust SDK gives us the right foundation for that: a single process that handles the protocol, batching, and connection management, and serves multiple clients over a simple API.

Additionally here, I think we should mention that the community is gearing up to move fluss-rust into fluss to streamline the release and development process, which is a strong signal of the community's commitment to fluss-rust's continued development.
